Abstract

This paper addresses the problem of stably recovering sparse or compressible signals from compressed
sensing measurements that have undergone optimal non-uniform scalar quantization, i.e., quantization
minimizing the common $\ell_2$-norm distortion. Generally, this Quantized Compressed Sensing (QCS)
problem is solved by minimizing the $\ell_1$-norm constrained by the $\ell_2$-norm distortion. In such
cases, re-measurement and quantization of the reconstructed signal do not necessarily match the initial
observations, showing that the whole QCS model is not consistent. Our approach considers instead that
quantization distortion more closely resembles heteroscedastic uniform noise, with variance depending on
the observed quantization bin. Generalizing our previous work on uniform quantization, we show that for
non-uniform quantizers described by the "compander" formalism, quantization distortion may be better
characterized as having bounded weighted $\ell_p$-norm ($p \geq 2$), for a particular weighting. We
develop a new reconstruction approach, termed Generalized Basis Pursuit DeNoise (GBPDN), which minimizes
the $\ell_1$-norm of the signal to reconstruct constrained by this weighted $\ell_p$-norm fidelity. We
prove that, for standard Gaussian sensing matrices and $K$-sparse or compressible signals in
$\mathbb{R}^N$ with at least $\Omega((K \log N/K)^{p/2})$ measurements, i.e., under a strongly
oversampled QCS scenario, GBPDN is $\ell_2 - \ell_1$ instance optimal and stably recovers all such sparse
or compressible signals. The reconstruction error decreases as $O(2^{-B}/\sqrt{p+1})$ given a budget of
$B$ bits per measurement. This yields a reduction by a factor $\sqrt{p+1}$ of the reconstruction error
compared to the one produced by $\ell_2$-norm constrained decoders. We also propose a primal-dual
proximal splitting scheme to solve the GBPDN program which is efficient for large-scale problems.
Interestingly, extensive simulations testing the GBPDN effectiveness confirm the trend predicted by the
theory, that the reconstruction error can indeed be reduced by increasing $p$, but this is achieved at a
much less stringent oversampling regime than the one expected by the theoretical bounds. Besides the QCS
scenario, we also show that GBPDN applies straightforwardly to the related case of CS measurements
corrupted by heteroscedastic Generalized Gaussian noise with provable reconstruction error reduction.



I. Introduction 

A. Problem statement 

Measurement quantization is a critical step in the design and in the dissemination of new technologies 
implementing the Compressed Sensing (CS) paradigm. Quantization is indeed mandatory for transmitting, 
storing and even processing any data sensed by a CS device. 

In its most popular version, CS provides uniform theoretical guarantees for stably recovering any sparse
(or compressible) signal at a sensing rate proportional to the signal intrinsic dimension (i.e., its
sparsity level) [1], [2]. However, the distortion introduced by any quantization step is often still
crudely modeled as a noise with bounded $\ell_2$-norm.

Such an approach results in reconstruction methods aiming at finding a sparse signal estimate for which
the sensing is close, in an $\ell_2$-sense, to the available quantized signal observations. However,
earlier works have pointed out that this method is not optimal. For instance, [11] analyses the error
achieved when a signal is reconstructed from its quantized coefficients in some overcomplete expansion.
Translated to our context, this amounts to the ideal CS scenario where some oracle provides us with the
true signal support. In this context, a linear least square (LS) reconstruction minimizing the
$\ell_2$-distance in the coefficient domain is inconsistent and has a mean square error (MSE) decaying,
at best, as the inverse of the frame redundancy factor. Interestingly, any consistent reconstruction
method, i.e., one for which the quantized coefficients of the reconstructed signal match those of the
original signal, shows a much better behavior since its MSE is in general lower-bounded by the inverse
of the squared frame redundancy, this lower bound being attained for specific overcomplete Fourier
frames.

A few other works in the Compressed Sensing literature have also considered the quantization distortion
differently. In [3], an adaptation of both the Basis Pursuit DeNoise (BPDN) program and the Subspace
Pursuit algorithm integrates an explicit constraint enforcing consistency. In [5], nonuniform
quantization noise and Gaussian noise in the measurements before quantization are properly dealt with
using an $\ell_1$-penalized maximum likelihood decoder.

Finally, in [4], [6], [7], the extreme case of 1-bit CS is studied, i.e., when only the signs of the
measurements are sent to the decoder. These works have shown that consistency with the 1-bit quantized
measurements is of paramount importance for reconstructing the signal, whereas straightforward methods
relying on $\ell_2$ fidelity constraints reach poor estimate quality.



B. Contributions 

The present work addresses the problem of recovering sparse or compressible signals in a given
non-uniform Quantized Compressed Sensing (QCS) scenario. In particular, we assume that the signal
measurements have undergone an optimal non-uniform scalar quantization process, i.e., optimized a priori
according to a common minimal distortion standpoint with respect to a source with known probability
density function (pdf). This post-quantization reconstruction strategy, where only increasing the number
of measurements can improve the signal reconstruction, is inspired by other works targeting consistent
reconstruction approaches in comparison with methods advocating solutions of minimal $\ell_2$-distortion
El |U. Our work is therefore distinct from approaches where other quantization schemes (e.g.,
$\Sigma\Delta$-quantization [13]) are tuned to the global CS formalism or to specific CS decoding schemes
(e.g., Message Passing Reconstruction [12]). These techniques often lead to a signal reconstruction MSE
rapidly decaying with the measurement number $M$ (for instance, an $r$-order $\Sigma\Delta$-quantization
of CS measurements combined with a particular reconstruction procedure has an MSE decaying nearly as
$O(M^{-r+1/2})$ [13]), but their application generally requires more involved quantization strategies at
the CS encoding stage.

This paper also generalizes the results provided in [8] to cover the case of non-uniform scalar
quantization of CS measurements. We show that the theory of "Companders" [9] provides an elegant
framework for stabilizing the reconstruction of a sparse (or compressible) signal from non-uniformly
quantized CS measurements. Under the High Resolution Assumption (HRA), i.e., when the bit budget of the
quantizer is high and the quantization bins are narrow, the compander theory provides an equivalent
description of the action of a quantizer through sequential application of a compressor, a uniform
quantization, then an expander (see Section II-A for details). As will be clearer later, this
equivalence allows us to define new distortion constraints for the signal reconstruction which are more
faithful to the non-uniform quantization process given a certain QCS measurement regime.

Algorithms for reconstructing from quantized measurements commonly rely on mathematically describing the
noise induced by quantization as bounded in some particular norm. A data fidelity constraint reflecting
this fact is then incorporated in the reconstruction method. Two natural examples of such constraints
are that the $\ell_2$-norm be bounded, or that the quantization error be such that the unquantized
values lie in specified, known quantization bins. In this paper, guided by the compander theory, we show
that these two constraints can be viewed as special (extreme) cases of a particular weighted
$\ell_p$-norm, which forms the basis for our reconstruction method. The weights are determined from a
set of $p$-optimal quantizer levels that are computed from the observed quantized values. We draw the
reader's attention to the fact that these weights do not depend on the original signal, which is of
course unknown. They are used only for signal reconstruction purposes, and are optimized with respect to
the weighted norm. In the QCS framework, and owing to the particular weighting of the norm, each
quantization bin contributes equally to the related global distortion.

Thanks to a new estimator of the weighted $\ell_p$-norm of the quantization distortion associated to
these particular levels (see Lemma 3), and with the proviso that the sensing matrix obeys a generalized
Restricted Isometry Property (RIP) expressed in the same norm (see (14)), we show that solving a
Generalized Basis Pursuit DeNoise program (GBPDN), that is, an $\ell_1$-minimization problem constrained
by a weighted $\ell_p$-norm whose radius is appropriately estimated, stably recovers strictly sparse or
compressible signals (see Theorem 1).

We also quantify precisely the reconstruction error of GBPDN as a function of the quantizer bit rate
(under the HRA) for any value of $p$ in the weighted $\ell_p$ constraint. These results reveal a set of
conflicting considerations for setting the optimal $p$. On the one hand, given a budget of $B$ bits per
measurement and for a high number of measurements $M$, the error decays as $O(2^{-B}/\sqrt{p+1})$ when
$p$ increases (see Proposition 3), i.e., a favorable situation since then GBPDN tends also to a
consistent reconstruction method. On the other hand, the larger $p$, the greater the number of
measurements required to ensure that the generalized RIP is fulfilled. In particular, one needs
$\Omega((K \log N/K)^{p/2})$ measurements compared to an $\ell_2$-based CS bound of
$\Omega(K \log N/K)$ measurements (see Proposition 1). Put differently, given a certain number of
measurements, the range of theoretically admissible $p$ is upper bounded, an effect which is expected
since the error due to quantization cannot be eliminated in the reconstruction.

In fact, the stability of GBPDN in the context of QCS is a consequence of an even more general
stability result that holds for a broader class of additive heteroscedastic measurement noise having a
bounded weighted $\ell_p$-norm. This for instance covers the case of heteroscedastic Generalized
Gaussian noise where the constraint of GBPDN can be interpreted as a (variance) stabilization of the
measurement distortion, see Section III-C.



C. Relation to prior work 

Our work is novel in several respects. For instance, as stated above, the quantization distortion in the
literature is often modeled as a mere Gaussian noise with bounded variance [3]. In [8], only uniform
quantization is handled and theoretically investigated. In [5], nonuniform quantization noise and
Gaussian noise are handled but theoretical guarantees are lacking. To the best of our knowledge, this is
the first work thoroughly investigating the theoretical guarantees of $\ell_1$ sparse recovery from
non-uniformly quantized CS measurements, by introducing a new class of convex $\ell_1$ decoders. The way
we bring the compander theory into the picture to compute the optimal weights from the quantized
measurements is an additional original aspect of this work.



D. Paper organization

The paper is organized as follows. In Section II, we recall the theory of optimal scalar quantization
seen through the compander formalism. We then explain how this point of view can help us in
understanding the intrinsic constraints that quantized CS measurements must satisfy, and we introduce a
new distortion measure, the $p$-Distortion Consistency, expressed in terms of a weighted $\ell_p$-norm.
Section III introduces the GBPDN CS class of decoders integrating weighted $\ell_p$-constraints, and
describes sufficient conditions for guaranteeing reconstruction stability. This section also shows the
generality of this procedure for stabilizing additive heteroscedastic GGD measurement noise during the
signal reconstruction. In Section IV, we explain how GBPDN can be used for reconstructing a signal in
QCS when its fidelity constraint is adjusted to the parameters defined in Section II-C. We show that
this specific choice leads to a (variance) stabilization of the quantization distortion, forcing each
quantization bin to contribute equally to the overall distortion error. In Section V, we describe a
provably convergent primal-dual proximal splitting algorithm to solve the GBPDN program, and demonstrate
the power of the proposed approach with several numerical experiments on sparse signals.



E. Notation 

All finite space dimensions are denoted by capital letters (e.g., $K, M, N, D \in \mathbb{N}$), vectors
(resp. matrices) are written in small (resp. capital) bold symbols. For any vector $u$, the
$\ell_p$-norm for $1 \leq p < \infty$ is $\|u\|_p = (\sum_i |u_i|^p)^{1/p}$, as usual
$\|u\|_\infty = \max_i |u_i|$ and we write $\|u\| = \|u\|_2$. We write $\|u\|_0 = \#\{i : u_i \neq 0\}$,
which counts the number of non-zero components. We denote the set of $K$-sparse vectors in the
canonical basis by $\Sigma_K = \{u \in \mathbb{R}^N : \|u\|_0 \leq K\}$. When necessary, we write
$\ell_p^D$ as the normed vector space $(\mathbb{R}^D, \|\cdot\|_p)$.

The identity matrix in $\mathbb{R}^D$ is written $\mathbb{1}_D$ (or simply $\mathbb{1}$ if $D$ is clear
from the context). $U = \mathrm{diag}(u)$ is the diagonal matrix with diagonal entries from $u$, i.e.,
$U_{ij} = u_i \delta_{ij}$. Given the $N$-dimensional signal space $\mathbb{R}^N$, the index set is
$[N] = \{1, \cdots, N\}$, and $\Phi_I \in \mathbb{R}^{M \times \#I}$ is the restriction of the columns of
$\Phi$ to those indexed in the subset $I \subset [N]$, whose cardinality is $\#I$. Given
$x \in \mathbb{R}^N$, $x_K^{\Psi}$ stands for the best $K$-term $\ell_2$-approximation of $x$ in the
orthonormal basis $\Psi \in \mathbb{R}^{N \times N}$, that is,
$x_K^{\Psi} = \Psi\big(\mathrm{argmin}\{\|x - \Psi\zeta\| : \zeta \in \mathbb{R}^N, \|\zeta\|_0 \leq K\}\big)$.
When $\Psi = \mathbb{1}$, we write $x_K = x_K^{\mathbb{1}}$ with $\|x_K\|_0 \leq K$. A random matrix
$\Phi \sim \mathcal{N}^{M \times N}(0, 1)$ is an $M \times N$ matrix with entries
$\Phi_{ij} \sim_{\rm iid} \mathcal{N}(0, 1)$. The 1-D Gaussian pdf of mean $\mu \in \mathbb{R}$ and
variance $\sigma^2 \in \mathbb{R}_+$ is denoted
$\gamma_{\mu,\sigma}(t) := (2\pi\sigma^2)^{-1/2} \exp(-\frac{(t-\mu)^2}{2\sigma^2})$.

For a function $f : \mathbb{R} \to \mathbb{R}$, we write
$|||f|||_q := (\int_{\mathbb{R}} dt\, |f(t)|^q)^{1/q}$, with $|||f|||_\infty := \sup_{t \in \mathbb{R}} |f(t)|$.

In order to state many results which hold asymptotically as a dimension $D \in \mathbb{R}$ increases, we
will use the common Landau family of notations, i.e., the symbols $O$, $\Omega$, $\Theta$, $o$, and
$\omega$ (their exact definition can be found in 03). Additionally, for
$f, g \in C^1(\mathbb{R}_+)$, we write $f(D) \simeq_D g(D)$ when $f(D) = g(D)(1 + o(1))$.
We also introduce two new asymmetric notations dealing with asymptotic quantity ordering, i.e.,

    $$f(D) \lesssim_D g(D) \ \Leftrightarrow\ \exists\, \delta : \mathbb{R} \to \mathbb{R}_+ : f(D) + \delta(D) \simeq_D g(D),$$
    $$f(D) \gtrsim_D g(D) \ \Leftrightarrow\ -f(D) \lesssim_D -g(D).$$

If any of the asymptotic relations above hold with respect to several large dimensions
$D_1, D_2, \cdots$, we write $\simeq_{D_1, D_2, \cdots}$ and correspondingly for $\lesssim$ and $\gtrsim$.

II. Non-Uniform Quantization in Compressed Sensing 

Let us consider a signal $x \in \mathbb{R}^N$ to be measured. We assume that it is either strictly
sparse or compressible, in a prescribed orthonormal basis
$\Psi = (\Psi_1, \cdots, \Psi_N) \in \mathbb{R}^{N \times N}$. This means that the signal
$x = \Psi\zeta = \sum_j \Psi_j \zeta_j$ is such that the $\ell_2$-approximation error
$\|\zeta - \zeta_K\| = \|x - x_K^{\Psi}\|$ quickly decreases (or vanishes) as $K$ increases. For the sake
of simplicity, and without loss of generality, the sparsity basis is taken in the sequel as the standard
basis, i.e., $\Psi = \mathbb{1}$, and $\zeta$ is identified with $x$. All the results can be readily
extended to other orthonormal bases $\Psi \neq \mathbb{1}$.

In this paper, we are interested in compressively sensing $x \in \mathbb{R}^N$ with a given measurement
matrix $\Phi \in \mathbb{R}^{M \times N}$. Each CS measurement, i.e., each entry of $z = \Phi x$,
undergoes a general scalar quantization. We will assume this quantization to be optimal relative to a
known distribution of each entry $z_i$. For simplicity, we only consider matrices $\Phi$ that yield
$z_i$ to be i.i.d. $\mathcal{N}(0, \sigma_0^2)$ Gaussian, with pdf $\varphi_0 := \gamma_{0,\sigma_0}$.
This is satisfied, for instance, if $\Phi \sim \mathcal{N}^{M \times N}(0, 1)$, with
$\sigma_0 = \|x\|_2$. When $\Phi = [\varphi_1, \cdots, \varphi_M]^T$ is a (fixed) realization of
$\mathcal{N}^{M \times N}(0, 1)$, the entries $z_j = \langle \varphi_j, x \rangle$ of the vector
$z = \Phi x$ are $M$ (fixed) realizations of the same Gaussian distribution
$\mathcal{N}(0, \|x\|^2)$. It is therefore legitimate to quantize these values optimally using the
normality of the source.¹

Our quantization scenario uses a $B$-bit quantizer $\mathcal{Q}$ which has been optimized with respect
to the measurement pdf $\varphi_0$ for $\mathcal{B} = 2^B$ levels
$\Omega = \{\omega_k : 1 \leq k \leq \mathcal{B}\}$ and thresholds
$\{t_k : 1 \leq k \leq \mathcal{B}+1\}$ with $-t_1 = t_{\mathcal{B}+1} = +\infty$. Unlike the framework
developed in [5], our sensing scenario considers that any noise corrupting the measurements before
quantization is negligible compared to the quantization distortion.

Consequently, given a measurement matrix $\Phi \in \mathbb{R}^{M \times N}$, our quantized sensing model
is

    $$y = \mathcal{Q}[\Phi x] = \mathcal{Q}[z] \in \Omega^M.$$    (1)

Following recent studies [3], [8], [13] in the CS literature, this work is interested in optimizing the
signal reconstruction stability from $y$ under different sensing conditions, for instance, when the
oversampling ratio $M/K$ is allowed to be large. Before going further into this signal sensing model,
let us first describe the selected quantization framework. The latter is based on a scalar quantization
of each component of the signal measurement vector.



A. Quantization, Companders and Distortion 

A scalar quantizer $\mathcal{Q}$ is defined from $\mathcal{B} = 2^B$ levels $\omega_k$ (coded by
$B = \log_2 \mathcal{B}$ bits) and $\mathcal{B} + 1$ thresholds
$t_k \in \mathbb{R} \cup \{\pm\infty\} = \overline{\mathbb{R}}$, with $\omega_k < \omega_{k+1}$ and
$t_k \leq \omega_k < t_{k+1}$ for all $1 \leq k \leq \mathcal{B}$. The $k$-th quantizer bin (or region)
is $\mathcal{R}_k = [t_k, t_{k+1})$, with bin width $\tau_k = t_{k+1} - t_k$. The quantizer
$\mathcal{Q}$ is a map $\mathbb{R} \to \Omega = \{\omega_k : 1 \leq k \leq \mathcal{B}\}$,
$t \mapsto \mathcal{Q}[t] = \omega_k \Leftrightarrow t \in \mathcal{R}_k$. An optimal scalar quantizer
$\mathcal{Q}$ with respect to a random source $Z$ with pdf $\varphi_Z$ is such that the distortion
$\mathbb{E}|Z - \mathcal{Q}[Z]|^2$ is minimized. Optimal levels and thresholds can be calculated for a
fixed number of quantization bins by the Lloyd-Max Algorithm [16], [17], or by an asymptotic (with
respect to $B$) companding approach [9].
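As an illustration of the Lloyd-Max iteration mentioned above, the following is a minimal sketch for a zero-mean Gaussian source. It is not the authors' implementation: the function names, the initialization and the fixed iteration count are assumptions made for the example. It alternates between placing thresholds at midpoints of adjacent levels and moving each level to the centroid of its bin, which has a closed form for a Gaussian pdf.

```python
import math

def gauss_pdf(t, sigma=1.0):
    # Density of N(0, sigma^2); evaluates to 0.0 at t = +/- infinity.
    return math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def gauss_cdf(t, sigma=1.0):
    return 0.5 * (1.0 + math.erf(t / (sigma * math.sqrt(2.0))))

def thresholds_from(levels):
    # Inner thresholds are midpoints of adjacent levels; outer ones are infinite.
    inner = [0.5 * (levels[k] + levels[k + 1]) for k in range(len(levels) - 1)]
    return [-math.inf] + inner + [math.inf]

def lloyd_max_gaussian(bits, sigma=1.0, iters=500):
    """Hypothetical helper: Lloyd-Max levels and thresholds for N(0, sigma^2)."""
    n = 2 ** bits
    # Assumed initialization: evenly spaced levels in [-2 sigma, 2 sigma].
    levels = [sigma * (-2.0 + 4.0 * (k + 0.5) / n) for k in range(n)]
    for _ in range(iters):
        thr = thresholds_from(levels)
        # Centroid of N(0, sigma^2) restricted to [a, b]:
        # sigma^2 * (pdf(a) - pdf(b)) / (cdf(b) - cdf(a)).
        levels = [
            sigma ** 2 * (gauss_pdf(thr[k], sigma) - gauss_pdf(thr[k + 1], sigma))
            / (gauss_cdf(thr[k + 1], sigma) - gauss_cdf(thr[k], sigma))
            for k in range(n)
        ]
    return levels, thresholds_from(levels)
```

For a 1-bit quantizer the iteration is stationary at the two levels $\pm\sigma\sqrt{2/\pi}$. The sketch is only meant for small bit budgets, since the probability mass of extreme bins becomes numerically tiny for large `bits`.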

Throughout this paper, we work under the HRA. This means that, given the source pdf $\varphi_Z$, the
number of bits $B$ is sufficient to validate the approximation

    $$\varphi_Z(t) \simeq_B \varphi_Z(\omega_k), \quad \forall t \in \mathcal{R}_k.$$    (HRA)

A common argument in quantization theory [9] states that under the HRA, every optimal regular
quantizer can be described by a compander (a portmanteau for "compressor" and "expander"). More
precisely, we have

    $$\mathcal{Q} = \mathcal{G}^{-1} \circ \mathcal{Q}_\alpha \circ \mathcal{G},$$

with $\mathcal{G} : \mathbb{R} \to [0, 1]$ a bijective function called the compressor,
$\mathcal{Q}_\alpha$ a uniform quantizer of the interval $[0, 1]$ of bin width $\alpha = 2^{-B}$, and the
inverse mapping $\mathcal{G}^{-1} : [0, 1] \to \mathbb{R}$ called the expander.

¹ Avoiding pathological situations where $x$ is adversarially forged knowing $\Phi$ for breaking this assumption.

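For optimal quantizers the compressor $\mathcal{G}$ maps the thresholds
$\{t_k : 1 \leq k \leq \mathcal{B}\}$ and the levels $\{\omega_k\}$ into the values

    $$t'_k := \mathcal{G}(t_k) = (k-1)\alpha, \qquad \omega'_k := \mathcal{G}(\omega_k) = (k - 1/2)\alpha,$$    (2)

and under the HRA the optimal $\mathcal{G}$ satisfies

    $$\mathcal{G}'(\lambda) = \frac{d}{d\lambda}\mathcal{G}(\lambda) = \Big(\int_{\mathbb{R}} \varphi_Z^{1/3}(t)\, dt\Big)^{-1} \varphi_Z^{1/3}(\lambda).$$    (3)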



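Intuitively, the function $\mathcal{G}'$, also called quantizer point density function (qpdf) [9],
relates the quantizer bin widths before and after domain compression by $\mathcal{G}$. Indeed, under
HRA, we can show that $\mathcal{G}'(\lambda) \simeq \alpha/\tau_k$ if $\lambda \in \mathcal{R}_k$. We
will see later that this function is the key to conveniently weight some new quantizer distortion
measures.

We note that, for $\varphi_Z(t) = \gamma_{0,\sigma}(t)$ with cumulative distribution function
$\phi_Z(\lambda; \sigma^2) = \frac{1}{2}\mathrm{erfc}(-\frac{\lambda}{\sqrt{2}\sigma})$, so that
$\phi_Z^{-1}(\lambda'; \sigma^2) = \sigma\sqrt{2}\, \mathrm{erf}^{-1}(2\lambda' - 1)$, we have
$\mathcal{G}(\lambda) = \phi_Z(\lambda; 3\sigma^2)$ and
$\mathcal{G}^{-1}(\lambda') = \phi_Z^{-1}(\lambda'; 3\sigma^2)$.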
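The Gaussian compander just described can be sketched in a few lines. This is an illustrative construction rather than code from the paper: `make_compander` and its return values are hypothetical names, and the compressor is taken to be the cdf of a zero-mean Gaussian with variance $3\sigma^2$, as stated in the text. The quantizer compresses the input, picks the uniform bin of width $\alpha = 2^{-B}$ in $[0, 1]$, and expands the bin center.

```python
import math
from statistics import NormalDist

def make_compander(bits, sigma=1.0):
    """Hypothetical helper: compander quantizer for a N(0, sigma^2) source."""
    alpha = 2.0 ** (-bits)                          # uniform bin width in [0, 1]
    law = NormalDist(0.0, math.sqrt(3.0) * sigma)   # Gaussian law with variance 3 sigma^2

    def compress(lam):
        return law.cdf(lam)                         # compressor G

    def expand(v):
        return law.inv_cdf(v)                       # expander G^{-1}, needs 0 < v < 1

    def quantize(lam):
        # 0-based bin index in the compressed domain, clipped to the last bin.
        k = min(int(compress(lam) / alpha), 2 ** bits - 1)
        # The level is the expanded center of the uniform bin.
        return expand((k + 0.5) * alpha)

    return compress, expand, quantize, alpha
```

By construction, the compressed input and the compressed level always fall in the same uniform bin of width $\alpha$, so they differ by at most $\alpha/2$.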

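The application of $\mathcal{G}$ modifies the source $Z$ such that
$\mathcal{G}(Z) - \mathcal{G}(\mathcal{Q}[Z])$ behaves more like a uniformly distributed random variable
over $[-\alpha/2, \alpha/2]$. The compander formalism predicts the distortion of an optimal scalar
quantizer under HRA. For high bit rate $B$, the Panter and Dite formula [18] states that

    $$\mathbb{E}|Z - \mathcal{Q}[Z]|^2 \lesssim_B \frac{2^{-2B}}{12} \int_{\mathbb{R}} \mathcal{G}'(t)^{-2} \varphi_Z(t)\, dt = \frac{2^{-2B}}{12} \Big(\int_{\mathbb{R}} \varphi_Z^{1/3}(t)\, dt\Big)^{3} = \frac{2^{-2B}}{12}\, |||\varphi_Z|||_{1/3}.$$    (4)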
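As a numerical sanity check of the Panter and Dite estimate for a Gaussian source, the sketch below evaluates $|||\varphi|||_{1/3} = (\int \varphi^{1/3})^3$ by a simple midpoint rule and compares it with the closed form $2\pi\sigma^2 3^{3/2}$ used later in the paper. The function names, the integration window and the number of nodes are assumptions made for illustration.

```python
import math

def gauss_pdf(t, sigma=1.0):
    return math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def norm_one_third(sigma=1.0, n=40000, span=30.0):
    """(integral of pdf^(1/3))^3 by the midpoint rule on [-span*sigma, span*sigma]."""
    h = 2.0 * span * sigma / n
    total = 0.0
    for i in range(n):
        t = -span * sigma + (i + 0.5) * h
        total += gauss_pdf(t, sigma) ** (1.0 / 3.0)
    return (total * h) ** 3

def panter_dite(bits, sigma=1.0):
    """High-resolution distortion estimate (2^{-2B} / 12) * |||pdf|||_{1/3}."""
    return 2.0 ** (-2 * bits) / 12.0 * norm_one_third(sigma)
```

With $\sigma = 1$ the estimate reduces to $\frac{\sqrt{3}\pi}{2} 2^{-2B}$, the familiar high-resolution distortion of an optimal Gaussian quantizer.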



Finally, we note that by the construction defined in (2), the quantized values $\mathcal{Q}[\lambda]$
satisfy

    $$|\mathcal{G}(\lambda) - \mathcal{G}(\mathcal{Q}[\lambda])| \leq \alpha/2, \quad \forall \lambda \in \mathbb{R}.$$    (5)

We describe in the next sections how (5) and (4) may be viewed as two extreme cases of a general class
of constraints satisfied by a quantized source $Z$.



B. Distortion and Quantization Consistency 

Let us consider the sensing model (1), for which the scalar quantizer $\mathcal{Q}$ and associated
compressor $\mathcal{G}$ are optimal relative to the measurements $z = \Phi x$ whose entries $z_i$ are
iid realizations of $\mathcal{N}(0, \sigma_0^2)$. In the compressor domain we may write

    $$\mathcal{G}(y) = \mathcal{G}(z) + \big(\mathcal{G}(\mathcal{Q}[z]) - \mathcal{G}(z)\big) = \mathcal{G}(z) + \varepsilon,$$

where $\varepsilon$ represents the quantization distortion. (5) then shows that

    $$\|\varepsilon\|_\infty = \|\mathcal{G}(\mathcal{Q}[z]) - \mathcal{G}(z)\|_\infty \leq \alpha/2.$$

Naively, one may expect any reasonable estimate $x^*$ of $x$ (obtained by some reconstruction method)
to reproduce the same quantized measurements as originally observed. Inspired by the terminology
introduced in [10], [11], we say that $x^*$ satisfies the quantization consistency (QC) if
$\mathcal{Q}[\Phi x^*] = y$. From the previous reasoning this is equivalent to

    $$\|\mathcal{G}(\Phi x^*) - \mathcal{G}(y)\|_\infty \leq \epsilon_{\rm QC} := \alpha/2.$$    (QC)

At first glance, it is tempting to try to impose QC directly in the data fidelity constraint. However,
as will be revealed by our analysis, directly imposing QC does not lead to an effective QCS
reconstruction algorithm. This counterintuitive effect, already observed in the case of signal recovery
from uniformly quantized CS [8], is due to the specific requirements that the sensing matrix should
respect to make such a consistent reconstruction method stable.

In contrast, the Basis Pursuit DeNoise (BPDN) program [19] enforces a constraint on the $\ell_2$-norm of
the reconstruction quantization error, which we will call distortion consistency. For BPDN, the estimate
$x^*$ is provided by

    $$x^* \in \mathrm{Argmin}_{u \in \mathbb{R}^N} \|u\|_1 \ \text{ s.t. } \ \|y - \Phi u\| \leq \epsilon_{\rm DC},$$

where the bound $\epsilon_{\rm DC}^2 := M \frac{\sqrt{3}\pi}{2} \sigma_0^2\, 2^{-2B}$ is dictated by the
Panter-Dite formula. According to the Strong Law of Large Numbers (SLLN) and under the HRA, and since
the $z_i$ are iid realizations of $Z \sim \mathcal{N}(0, \sigma_0^2)$, the following holds almost surely

    $$\frac{1}{M}\|z - \mathcal{Q}[z]\|^2 \simeq_M \mathbb{E}|Z - \mathcal{Q}[Z]|^2 \lesssim_B \frac{2^{-2B}}{12}\, |||\varphi_0|||_{1/3} = \frac{\sqrt{3}\pi}{2} \sigma_0^2\, 2^{-2B}.$$    (6)

Accordingly, we say that any estimate $x^*$ satisfies distortion consistency (DC) if

    $$\|\Phi x^* - y\| \leq \epsilon_{\rm DC}.$$    (DC)

However, as stated for the uniform quantization case in [8], DC and QC do not imply each other. In
particular, the output $x^*$ of BPDN need not satisfy quantization consistency. A major motivation for
the present work is the desire to develop provably stable QCS recovery methods based on measures of
quantization distortion that are as close as possible to QC.



C. p-Distortion Consistency 

This section shows that the QC and DC constraints may be seen as limit cases of a weighted
$\ell_p$-norm description of the quantization distortion. The expression of the appropriate weights in
the weighted $\ell_p$-norm will depend both on the $p$-optimal quantizer levels, described below, and on
the quantizer point density function $\mathcal{G}'$ introduced in Section II-A.

For the Gaussian pdf $\varphi_0 = \gamma_{0,\sigma_0}$, given a set of thresholds
$\{t_k : 1 \leq k \leq \mathcal{B}\}$, we define the $p$-optimal quantizer levels
$\omega_{k,p} \in \mathcal{R}_k$ as

    $$\omega_{k,p} := \mathrm{argmin}_{\lambda \in \mathcal{R}_k} \int_{\mathcal{R}_k} |t - \lambda|^p\, \varphi_0(t)\, dt,$$    (7)

for $2 \leq p < \infty$, and $\omega_{k,\infty} := \frac{1}{2}(t_k + t_{k+1})$. These generalized levels
were for instance already defined by Max in his minimal distortion study [17], and their definition is
also related to the concept of minimal $p$-th power distortion [9]. For $p = 2$, we find the
definition of the initial quantizer levels, i.e., $\omega_{k,2} = \omega_k$. In this paper, we always
assume that $p$ is a positive integer but all our analysis can be extended to the positive real case. As
proved in Appendix B, the $p$-optimal levels are well-defined.

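Lemma 1 ($p$-optimal Level Well-Definiteness). The $p$-optimal levels $\omega_{k,p}$ are uniquely
defined. Moreover, for $\sigma_0 > 0$, $\lim_{p \to +\infty} \omega_{k,p} = \omega_{k,\infty}$, with
$|\omega_{k,p}| = \Omega(\sqrt{p})$ for $k \in \{1, \mathcal{B}\}$.

Using these new levels, we define the (suboptimal) quantizers $\mathcal{Q}_p$ (with
$\mathcal{Q}_2 = \mathcal{Q}$) such that

    $$\mathcal{Q}_p[t] = \omega_{k,p} \ \Leftrightarrow\ t \in \mathcal{R}_k = \mathcal{Q}_p^{-1}[\omega_{k,p}] = \mathcal{Q}^{-1}[\omega_k].$$    (8)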

Two important points must be explained regarding the definition of $\mathcal{Q}_p$. First, the
(re)quantization of any source $Z$ with $\mathcal{Q}_p$ is possible from the knowledge of the quantized
value $\mathcal{Q}[Z]$, as $\mathcal{Q}_p[Z] = \mathcal{Q}_p[\mathcal{Q}[Z]]$ since both quantizers
share the same decision thresholds. Second, despite the sub-optimality of $\mathcal{Q}_p$ relative to
the untouched thresholds $\{t_k : 1 \leq k \leq \mathcal{B}\}$, we will see later that introducing this
quantizer provides improvement in the modeling of $\mathcal{Q}_p[Z] - Z$ by a Generalized Gaussian
Distribution (GGD) in each quantization bin.

Remark 1. Unfortunately, there is no closed form formula for computing $\omega_{k,p}$. However, as
detailed in Appendix A, they can be computed up to numerical precision using Newton's method combined
with simple numerical quadrature for the integral in (7).
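To make Remark 1 concrete, here is a minimal sketch of such a computation for a finite bin $[a, b]$. It is not the procedure of the appendix, which is not reproduced here: the midpoint quadrature, the starting point at the bin center, the clamping of iterates to the bin and all names are assumptions. Newton's method is applied to the derivative of the convex objective $\lambda \mapsto \int_a^b |t - \lambda|^p \varphi_0(t)\, dt$.

```python
import math

def gauss_pdf(t, sigma=1.0):
    return math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def p_optimal_level(a, b, p, sigma=1.0, n=4000, iters=50):
    """Hypothetical helper: p-optimal level of the finite bin [a, b] for N(0, sigma^2)."""
    h = (b - a) / n
    nodes = [a + (i + 0.5) * h for i in range(n)]     # midpoint quadrature nodes
    dens = [gauss_pdf(t, sigma) for t in nodes]
    lam = 0.5 * (a + b)                               # start from the bin center
    for _ in range(iters):
        g = 0.0   # proportional to the first derivative of the objective
        c = 0.0   # proportional to the second derivative (same constant factor)
        for t, d in zip(nodes, dens):
            r = lam - t
            g += math.copysign(abs(r) ** (p - 1), r) * d
            c += (p - 1) * abs(r) ** (p - 2) * d
        step = g / c
        lam = min(max(lam - step, a), b)              # Newton update kept inside the bin
        if abs(step) < 1e-14:
            break
    return lam
```

For $p = 2$ a single Newton step lands on the centroid of the bin, and for large $p$ the level drifts toward the bin midpoint, in line with Lemma 1. The two unbounded outer bins would need a truncated or transformed quadrature, which this sketch does not attempt.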

Given $p \geq 2$ and for high $B$, the asymptotic behavior of a quantizer $\mathcal{Q}_p$ and of its
$p$-th power distortion $\int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\, dt$ in each bin
$\mathcal{R}_k$ follows two very different regimes in $\mathbb{R}$ governed by a particular transition
value $T = \Theta(\sqrt{B})$. This is described in the following lemma (proved in Appendix C), which,
to the best of our knowledge, provides new results and may be of independent interest for characterizing
Gaussian source quantization (even for the standard case $p = 2$).

Lemma 2 (Asymptotic $p$-Quantization Characterization). Given the Gaussian pdf $\varphi_0$ and its
associated compressor function $\mathcal{G}$, choose $0 < \beta < 1$ and $p \in \mathbb{N}$, and define
the transition value

    $$T = T(B) = \big(6\, \sigma_0^2\, (\log 2)\, \beta B\big)^{1/2}.$$

$T$ defines two specific asymptotic regimes for the quantizer $\mathcal{Q}_p$:

1) The vanishing bin regime $\mathcal{T} = [-T, T]$: for all $\mathcal{R}_k \subset \mathcal{T}$ and any
$c \in \mathcal{R}_k$, the bin widths decay as $\tau_k = O(2^{-(1-\beta)B})$, and the related $p$-th
power distortion and qpdf asymptotically obey

    $$\int_{\mathcal{R}_k} |t - \omega_{k,p}|^p\, \varphi_0(t)\, dt \simeq_B \frac{\tau_k^{p+1}}{(p+1)\, 2^p}\, \varphi_0(c),$$    (9)

    $$\mathcal{G}'(c) \simeq_B \frac{\alpha}{\tau_k}.$$    (10)

2) The vanishing distortion regime $\mathcal{T}^c$: we have
$\mathcal{G}'(t) \leq \mathcal{G}'(T(B)) = \Theta(2^{-\beta B})$ for all $t \in \mathcal{T}^c$.
Moreover, the number of bins in $\mathcal{T}^c$ and their $p$-th power distortion decay, respectively, as

    $$\#\{k : \mathcal{R}_k \subset \mathcal{T}^c\} = \Theta\big(B^{-1/2}\, 2^{(1-\beta)B}\big),$$    (11)

    $$\int_{\mathcal{R}_k} |t - \omega_{k,p}|^p\, \varphi_0(t)\, dt = O\big(B^{-(p+1)/2}\, 2^{-3\beta B}\big), \quad \forall \mathcal{R}_k \subset \mathcal{T}^c.$$    (12)

We now state an important result, proved in Appendix D from the statements of Lemma 2, which,
together with the SLLN, estimates the quantization distortion of $\mathcal{Q}_p$ on a random Gaussian
vector. Given $p \geq 1$ and some positive weights $w = (w_1, \cdots, w_M)^T \in \mathbb{R}_+^M$, this
distortion is measured by a weighted $\ell_p$-norm defined as²
$\|v\|_{p,w} := \|\mathrm{diag}(w)\, v\|_p$ for any $v \in \mathbb{R}^M$.

Lemma 3 (Asymptotic Weighted $\ell_p$-Distortion). Let $z \in \mathbb{R}^M$ be a random vector where
each component $z_i \sim_{\rm iid} \varphi_0$. Given the optimal compressor function $\mathcal{G}$
associated to $\varphi_0$ and the weights $w = w(p)$ such that
$w_i(p) = \mathcal{G}'(\mathcal{Q}_p[z_i])^{(p-2)/p}$ for $p \geq 2$, the following holds almost surely

    $$\|\mathcal{Q}_p[z] - z\|_{p,w}^p \lesssim_{B,M} M\, \frac{2^{-Bp}}{(p+1)\, 2^p}\, |||\varphi_0|||_{1/3} =: \epsilon_p^p,$$    (13)

with $|||\varphi_0|||_{1/3} = 2\pi \sigma_0^2\, 3^{3/2}$.

This lemma provides a tight estimation for $p = 2$ and $p \to +\infty$. Indeed, in the first case
$w = 1$ and the bound matches the Panter-Dite estimation (6). For $p \to \infty$, we observe that
$\epsilon_\infty = 2^{-(B+1)} = \alpha/2 = \epsilon_{\rm QC}$.
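The two limit cases of the bound in Lemma 3 can be checked numerically with a short sketch. The function name `eps_p` and the choice to evaluate the bound through logarithms (to avoid underflow of $2^{-Bp}$ for large $p$) are assumptions made for this illustration.

```python
import math

def eps_p(M, bits, p, sigma0=1.0):
    """Radius eps_p with eps_p^p = M * 2^{-Bp} / ((p+1) 2^p) * (2 pi sigma0^2 3^{3/2})."""
    c = 2.0 * math.pi * sigma0 ** 2 * 3.0 ** 1.5
    # log(eps_p) = [log M + log c - log(p+1)] / p - (B + 1) log 2
    log_eps = (math.log(M) + math.log(c) - math.log(p + 1.0)) / p \
        - (bits + 1) * math.log(2.0)
    return math.exp(log_eps)
```

For $p = 2$ the squared radius equals $M \frac{\sqrt{3}\pi}{2}\sigma_0^2 2^{-2B}$, and for very large $p$ the radius approaches $2^{-(B+1)}$ whatever the value of $M$.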

² A more standard weighted $\ell_p$-norm definition reads $(\sum_i w_i |v_i|^p)^{1/p}$. Our definition
choice, which is strictly equivalent, offers useful writing simplifications, e.g., when observing that
$\|\Phi x\|_{p,w} = \|\Phi' x\|_p$ with $\Phi' = \mathrm{diag}(w)\Phi$.

Fig. 1: Comparing the theoretical bound $\epsilon_p$ to the empirical mean estimate of
$\mathbb{E}\|\mathcal{Q}_p[z] - z\|_{p,w}$ using 1000 trials of Monte-Carlo simulations, for each
$B = 3, 4, 5$.



Fig. 1 shows how well $\epsilon_p$ estimates the distortion $\|\mathcal{Q}_p[z] - z\|_{p,w}$ for the
weights and the $p$-optimal levels given in Lemma 3. This has been measured by averaging this
quantization distortion for 1000 realizations of a Gaussian random vector
$z \sim \mathcal{N}^M(0, 1)$ with $M = 2^{10}$, $p \in \{2, \cdots, 15\}$ and $B = 3$, 4 and 5. We
observe that the bias of $\epsilon_p$, as reflected here by the ratio
$\epsilon_p^{-1}\, \mathbb{E}\|\mathcal{Q}_p[z] - z\|_{p,w}$, is rather limited and decreases when $p$
and $B$ increase, with a maximum relative error of about 2.5% between the true and estimated distortion
at $B = 3$ and $p = 2$.

Inspired by relation (13), we say that an estimate $x^* \in \mathbb{R}^N$ of $x$ sensed by the model
(1) satisfies the $p$-Distortion Consistency (or D$_p$C) if

    $$\|\Phi x^* - \mathcal{Q}_p[y]\|_{p,w} \leq \epsilon_p,$$    (D$_p$C)

with the weights $w_i(p) = \mathcal{G}'(\mathcal{Q}_p[y_i])^{(p-2)/p}$.
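A direct way to test the D$_p$C constraint on a candidate is sketched below, assuming a Gaussian source so that the qpdf $\mathcal{G}'$ is the density of a zero-mean Gaussian with variance $3\sigma_0^2$ (the derivative of the compressor given in Section II-A). The function names and the representation of $\Phi x^*$ and $\mathcal{Q}_p[y]$ as plain lists are assumptions of the example.

```python
import math
from statistics import NormalDist

def dpc_weights(levels_p, p, sigma0=1.0):
    """w_i(p) = G'(Q_p[y_i])^{(p-2)/p}, with G' the N(0, 3 sigma0^2) density."""
    gprime = NormalDist(0.0, math.sqrt(3.0) * sigma0).pdf
    return [gprime(q) ** ((p - 2.0) / p) for q in levels_p]

def weighted_lp_norm(v, w, p):
    """||v||_{p,w} = ||diag(w) v||_p."""
    return sum(abs(wi * vi) ** p for wi, vi in zip(w, v)) ** (1.0 / p)

def satisfies_dpc(phi_x, levels_p, p, eps, sigma0=1.0):
    """Check ||Phi x* - Q_p[y]||_{p,w} <= eps for a candidate with measurements phi_x."""
    w = dpc_weights(levels_p, p, sigma0)
    resid = [a - q for a, q in zip(phi_x, levels_p)]
    return weighted_lp_norm(resid, w, p) <= eps
```

For $p = 2$ all weights equal one and the test reduces to an ordinary $\ell_2$ distortion check.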

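The class of D$_p$C constraints has QC and DC as its limit cases.

Lemma 4. Given $y = \mathcal{Q}[\Phi x]$, we have asymptotically in $B$

    $$\mathrm{D}_2\mathrm{C} \equiv \mathrm{DC} \quad \text{and} \quad \mathrm{D}_\infty\mathrm{C} \equiv \mathrm{QC}.$$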

Proof: Let $x^* \in \mathbb{R}^N$ be a vector to be tested with the DC, QC or D$_p$C constraints. The
first equivalence for $p = 2$ is straightforward since $w(2) = 1$,
$\|\Phi x^* - \mathcal{Q}_p[y]\|_{p,w} = \|\Phi x^* - \mathcal{Q}[y]\|_2$ and
$\epsilon_2^2 = \epsilon_{\rm DC}^2 = M \frac{2^{-2B}}{12}\, |||\varphi_0|||_{1/3}$ from (6).

For the second, we use the fact that $y = \mathcal{Q}[\Phi x]$ is fixed by the sensing model (1). Let us
denote by $k(i)$ the index of the bin to which $\mathcal{Q}_p[y_i]$ belongs for $1 \leq i \leq M$. Since
$\|\Phi x\|_\infty$ is fixed, and because relation (11) in Lemma 2 implies that the amplitude of the
first or of the last $\Theta(B^{-1/2}\, 2^{(1-\beta)B})$ thresholds grow faster than
$T = \Theta(\sqrt{\beta B})$ for $0 < \beta < 1$, there exists necessarily a $B_0 \geq 0$ such that
$-T(B) \leq t_{k(i)} \leq t_{k(i)+1} \leq T(B)$ for all $B \geq B_0$ and all $1 \leq i \leq M$.

Writing $W_p = \mathrm{diag}(w(p))$, we can use the equivalence
$\|\cdot\|_\infty \leq \|\cdot\|_p \leq M^{1/p} \|\cdot\|_\infty$ and the squeeze theorem on the
following limit:

    $$\lim_{p \to \infty} \|\Phi x^* - \mathcal{Q}_p[y]\|_{p,w(p)} = \lim_{p \to \infty} \|W_p(\Phi x^* - \mathcal{Q}_p[y])\|_p = \lim_{p \to \infty} \|W_p(\Phi x^* - \mathcal{Q}_p[y])\|_\infty.$$

Moreover, since for $B \geq B_0$ and for all $1 \leq i \leq M$ the bin $\mathcal{R}_{k(i)}$ is finite,
the limit

    $$\lim_{p \to \infty} \mathcal{G}'(\mathcal{Q}_p[y_i])^{(p-2)/p}\, |(\Phi x^*)_i - \mathcal{Q}_p[y_i]|$$

exists and is finite. Therefore, from the continuity of the max function applied on the $M$ components
of vectors in $\mathbb{R}^M$, we find

    $$\lim_{p \to \infty} \|\Phi x^* - \mathcal{Q}_p[y]\|_{p,w(p)} = \lim_{p \to \infty} \max_i\, \mathcal{G}'(\mathcal{Q}_p[y_i])^{(p-2)/p}\, |(\Phi x^*)_i - \mathcal{Q}_p[y_i]|$$
    $$= \max_i \lim_{p \to \infty} \mathcal{G}'(\mathcal{Q}_p[y_i])^{(p-2)/p}\, |(\Phi x^*)_i - \mathcal{Q}_p[y_i]|$$
    $$= \max_i\, \mathcal{G}'(\mathcal{Q}_\infty[y_i])\, |(\Phi x^*)_i - \mathcal{Q}_\infty[y_i]|.$$

For $B \geq B_0$, (10) provides $\mathcal{G}'(\mathcal{Q}_\infty[y_i]) \simeq_B \alpha/\tau_{k(i)}$ so
that, if we impose
$\lim_{p \to \infty} \|\Phi x^* - \mathcal{Q}_p[y]\|_{p,w(p)} \leq \epsilon_{\rm QC} = \alpha/2$, we get
asymptotically in $B$

    $$\max_i\, \tau_{k(i)}^{-1}\, |(\Phi x^*)_i - \mathcal{Q}_\infty[y_i]| \leq \tfrac{1}{2},$$

which is equivalent to imposing $(\Phi x^*)_i \in \mathcal{R}_{k(i)}$, i.e., the Quantization
Constraint. ■
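The norm equivalence used in the squeeze argument of this proof, $\|v\|_\infty \leq \|v\|_p \leq M^{1/p}\|v\|_\infty$, is easy to observe numerically. The sketch below uses an arbitrary illustrative vector and a rescaled evaluation of the $\ell_p$-norm (an implementation choice made here to avoid overflow for large $p$).

```python
def linf_norm(v):
    return max(abs(x) for x in v)

def lp_norm(v, p):
    """l_p norm computed as m * (sum (|v_i| / m)^p)^(1/p) with m = max |v_i|."""
    m = linf_norm(v)
    if m == 0.0:
        return 0.0
    return m * sum((abs(x) / m) ** p for x in v) ** (1.0 / p)

v = [0.3, -2.0, 1.5, 2.0]          # arbitrary example vector, M = 4
for p in (2, 8, 64, 512):
    lower = linf_norm(v)
    upper = len(v) ** (1.0 / p) * linf_norm(v)
    # The l_p norm is squeezed between the two bounds, which meet as p grows.
    print(p, lower, lp_norm(v, p), upper)
```

As $p$ grows, the factor $M^{1/p}$ tends to one, so the $\ell_p$-norm is forced toward the largest entry in magnitude.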

III. Weighted $\ell_p$ Fidelities in Compressed Sensing and General Reconstruction Guarantees

The last section has provided us with some weighted $\ell_{p,w}$ constraints, with appropriate weights
$w$, that can be used for stabilizing the reconstruction of a signal observed through the quantized
sensing model (1). We now turn to studying the stability of $\ell_1$-based decoders integrating these
weighted $\ell_{p,w}$-constraints as data fidelity. We will also highlight the requirements that the
sensing matrix must fulfill to ensure this stability. We then apply this general stability result to
additive heteroscedastic GGD noise, where the weighting can be viewed as a variance stabilization
transform. Section IV will later instantiate the outcome of this section to the particular case of QCS.



A. Generalized Basis Pursuit DeNoise 

Given some positive weights $w \in \mathbb{R}_+^M$ and $p \geq 2$, we study the following general
minimization program, coined Generalized Basis Pursuit DeNoise (GBPDN),

    $$\Delta_{p,w}(y, \Phi, \epsilon) = \mathrm{Argmin}_{u \in \mathbb{R}^N} \|u\|_1 \ \text{ s.t. } \ \|y - \Phi u\|_{p,w} \leq \epsilon,$$    (GBPDN($\ell_{p,w}$))

where $\|\cdot\|_{p,w}$ is the weighted $\ell_p$-norm defined in the previous section. Note that BPDN is
a special case of GBPDN corresponding to $p = 2$ and $w = 1$. The Basis Pursuit DeQuantizers (BPDQ)
introduced in [8] are associated to $p \geq 1$ and $w = 1$, while the case $p = 1$ and $w = 1$ has also
been covered in EDI.
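The weighted fidelity term of GBPDN can be rewritten as an unweighted $\ell_p$ fidelity on rescaled data, since $\|y - \Phi u\|_{p,w} = \|\mathrm{diag}(w)y - \mathrm{diag}(w)\Phi u\|_p$ by definition of the weighted norm. The sketch below checks this identity and the feasibility of a candidate on toy data; it does not solve the program, and all names and numerical values are made up for illustration.

```python
def matvec(A, u):
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def lp_norm(v, p):
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def weighted_lp_norm(v, w, p):
    """||v||_{p,w} = ||diag(w) v||_p."""
    return lp_norm([wi * vi for wi, vi in zip(w, v)], p)

def rescale(y, Phi, w):
    """Return (diag(w) y, diag(w) Phi): each measurement row is scaled by its weight."""
    return ([wi * yi for wi, yi in zip(w, y)],
            [[wi * a for a in row] for wi, row in zip(w, Phi)])

def gbpdn_feasible(y, Phi, u, w, p, eps):
    """True if u satisfies the GBPDN constraint ||y - Phi u||_{p,w} <= eps."""
    resid = [yi - zi for yi, zi in zip(y, matvec(Phi, u))]
    return weighted_lp_norm(resid, w, p) <= eps

# Toy data (hypothetical values).
Phi = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
y = [0.1, 0.3, 0.5]
w = [1.0, 0.5, 2.0]
u = [0.2, -0.1]
p = 4
```

In practice this means that a solver written for the unweighted $\ell_p$-constrained program can be reused on the rescaled pair $(\mathrm{diag}(w)y, \mathrm{diag}(w)\Phi)$.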

We are going to see that the stability of GBPDN($\ell_{p,w}$) is guaranteed if $\Phi$ satisfies a particular instance of the following general isometry property.

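Definition 1. Given two normed spaces $\mathcal{X} = (\mathbb{R}^M, \|\cdot\|_{\mathcal X})$ and $\mathcal{Y} = (\mathbb{R}^N, \|\cdot\|_{\mathcal Y})$ (with $M < N$), a matrix $\Phi \in \mathbb{R}^{M\times N}$ satisfies the Restricted Isometry Property from $\mathcal X$ to $\mathcal Y$ at order $K \in \mathbb{N}$, radius $0 \le \delta < 1$ and for a normalization $\mu > 0$, if for all $x \in \Sigma_K$,

$$(1-\delta)^{1/\kappa}\, \|x\|_{\mathcal Y} \le \tfrac{1}{\mu}\,\|\Phi x\|_{\mathcal X} \le (1+\delta)^{1/\kappa}\, \|x\|_{\mathcal Y}, \qquad (14)$$

$\kappa$ being an exponent function of the geometries of $\mathcal X, \mathcal Y$. To lighten notation, we will write that $\Phi$ is RIP$_{\mathcal X,\mathcal Y}(K,\delta,\mu)$.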
We may notice that the common RIP is equivalent to$^3$ the RIP$_{\ell_2^M,\ell_2^N}(K,\delta,1)$ with $\kappa = 1$, while the RIP$_{p,q}$ introduced earlier in [8] is equivalent to the RIP$_{\ell_p^M,\ell_q^N}(K,\delta,\mu)$ with $\kappa = q$ and $\mu$ depending only on $M$, $p$ and $q$. Moreover, the RIP$_{p,K,\delta'}$ defined in fnf is equivalent to the RIP$_{\ell_p^M,\ell_p^N}(K,\delta,\mu)$ with $\kappa = 1$, $\delta' = 2\delta/(1-\delta)$ and $\mu = 1/(1-\delta)$. Finally, the Restricted $p$-Isometry Property proposed in [22] is also equivalent to the RIP$_{\ell_p^M,\ell_2^N}(K,\delta,1)$ with $\kappa = p$.

In order to study the behavior of the GBPDN program, we are interested in the embedding induced by $\Phi$ in (14) of $\mathcal Y = \ell_2^N$ into the normed space $\mathcal X = \ell_{p,w}^M = (\mathbb{R}^M, \|\cdot\|_{p,w})$, i.e., we consider the RIP$_{\ell_{p,w}^M,\ell_2^N}$ property that we write in the following as RIP$_{p,w}$. The following theorem establishes that GBPDN provides stable recovery from distorted measurements, if the RIP$_{p,w}$ holds.

Theorem 1. Let $K > 0$, $2 \le p < \infty$ and $\Phi \in \mathbb{R}^{M\times N}$ be a RIP$_{p,w}(s,\delta_s,\mu)$ matrix for $s \in \{K, 2K, 3K\}$ such that

$$\delta_{2K} + \sqrt{(1+\delta_K)(\delta_{2K}+\delta_{3K})(p-1)} < 1/3. \qquad (15)$$

Then, for any signal $x \in \mathbb{R}^N$ observed according to the noisy sensing model $y = \Phi x + \varepsilon$ with $\|\varepsilon\|_{p,w} \le \epsilon$, the unique solution $x^* = \Delta_{p,w}(y,\Phi,\epsilon)$ obeys

$$\|x^* - x\| \le 4\, e_0(K) + 8\, \epsilon/\mu, \qquad (16)$$

where $e_0(K) = K^{-1/2}\, \|x - x_K\|_1$ is the $K$-term $\ell_1$-approximation error.

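Proof: If $\Phi$ is RIP$_{p,w}(s,\delta_s,\mu)$ for $s \in \{K, 2K, 3K\}$, then, by definition of the weighted $\ell_{p,w}$-norm, $\mathrm{diag}(w)\Phi$ is RIP$_{\ell_p^M,\ell_2^N}(s,\delta_s,\mu)$. Since $\Delta_{p,w}(y,\Phi,\epsilon) = \Delta_p(\mathrm{diag}(w)\,y, \mathrm{diag}(w)\Phi, \epsilon)$, the stability results proved in [8, Theorem 2] for GBPDN($\ell_p$)$^4$ show that

$$\|x^* - x\| \le A_p\, e_0(K) + B_p\, \epsilon/\mu,$$

with $A_p = \frac{2(1 + C_p - \delta_{2K})}{1 - \delta_{2K} - C_p}$, $B_p = \frac{4\sqrt{1+\delta_{2K}}}{1 - \delta_{2K} - C_p}$ and $C_p \le \sqrt{(1+\delta_K)(\delta_{2K}+\delta_{3K})(p-1)}$ [8]. It is easy to see that if (15) holds, then $A_p \le 4$ and $B_p \le 8$. ■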

As we shall see shortly, this theorem may be used to characterize the impact of measurement corruption due to both additive heteroscedastic GGD noise (Section III-C) and non-uniform scalar quantization (Section IV). Before detailing these two sensing scenarios, we first address the question of designing matrices satisfying the RIP$_{p,w}$ for $2 \le p < \infty$.

B. Weighted Isometric Mappings 

We will describe a random matrix construction that will satisfy the RIP$_{p,w}$ for $1 \le p < \infty$. To quantify when this is possible, we introduce some properties on the positive weights $w$.

Definition 2. A weight generator $\mathcal W$ is a process (random or deterministic) that associates to $M \in \mathbb{N}$ a weight vector $w = \mathcal W(M) \in \mathbb{R}^M_+$. This process is said to be of Converging Moments (CM) if for $p \ge 1$ and all $M \ge M_0$ for a certain $M_0 > 0$,

$$\rho_p^{\min} \le M^{-1/p}\, \|\mathcal W(M)\|_p \le \rho_p^{\max}, \qquad (17)$$

where $\rho_p^{\min} > 0$ and $\rho_p^{\max} > 0$ are, respectively, the largest and the smallest values such that (17) holds. In other words, a CM generator $\mathcal W$ is such that $\|\mathcal W(M)\|_p^p = \Theta(M)$. By extension, we say that the weighting vector $w$ has the CM property, if it is generated by some CM weight generator $\mathcal W$.



$^3$ Assuming the columns of $\Phi$ are normalized to unit-norm.
$^4$ Dubbed BPDQ in [8].






The CM property can be ensured if $\lim_{M\to\infty} M^{-1/p}\|w\|_p$ exists, is bounded and nonzero. It is also ensured if the weights $\{w_i\}_{1\le i\le M}$ are taken (with repetition) from a finite set of positive values. More generally, if $\{w_i : 1 \le i \le M\}$ are iid random variables, we have $\lim_{M\to\infty} M^{-1}\|w\|_p^p = \mathbb{E}|w_1|^p$ almost surely by the SLLN. Notice finally that $\rho_p^{\max} \le \|w\|_\infty = \rho_\infty^{\max}$ since $\|w\|_p^p \le M\|w\|_\infty^p$, and $\rho_p^{\min} \ge \min_i |w_i|$.

For a weighting vector $w$ having the CM property, we define also its weighting dynamic at moment $p$ as the ratio

$$\theta_p := \Big(\frac{\rho_\infty^{\max}}{\rho_p^{\min}}\Big)^2.$$

We will see later that $\theta_p$ directly influences the number of measurements required to guarantee the existence of RIP$_{p,w}$ random Gaussian matrices.

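Given a weight vector $w$, the following lemma (proved in Appendix E) characterizes the expectation of the $\ell_{p,w}$-norm of a random Gaussian vector.

Lemma 5 (Gaussian $\ell_{p,w}$-Norm Expectation). If $\xi \sim \mathcal N^M(0,1)$ and if the weights $w$ have the CM property, then, for $1 \le p < \infty$ and $Z \sim \mathcal N(0,1)$,

$$\big(1 + 2^{p+1}\,\theta_p^{p}\,M^{-1}\big)^{\frac1p - 1}\, \big(\mathbb{E}\|\xi\|_{p,w}^p\big)^{\frac1p} \le \mathbb{E}\|\xi\|_{p,w} \le \big(\mathbb{E}\|\xi\|_{p,w}^p\big)^{\frac1p} = \big(\mathbb{E}|Z|^p\big)^{1/p}\, \|w\|_p.$$

In particular, $\mathbb{E}\|\xi\|_{p,w} \simeq_M \nu_p\, \|w\|_p \ge \nu_p\, M^{1/p}\, \rho_p^{\min}$, with $\nu_p^p := \mathbb{E}|Z|^p = 2^{p/2}\, \pi^{-1/2}\, \Gamma(\tfrac{p+1}{2})$.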
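The closed form of $\nu_p^p = \mathbb{E}|Z|^p$ and the large-$M$ approximation $\mathbb{E}\|\xi\|_{p,w} \simeq_M \nu_p \|w\|_p$ can be checked numerically. The following is only an illustrative sketch: the function names, the two-valued weight vector and the Monte Carlo settings are assumptions made here, not part of the paper.

```python
import math
import random


def nu_p_pow_p(p: float) -> float:
    """E|Z|^p for Z ~ N(0,1): 2^(p/2) * pi^(-1/2) * Gamma((p+1)/2)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)


def mean_weighted_norm(w, p, trials=400, seed=0):
    """Monte Carlo estimate of E||xi||_{p,w} for xi with iid N(0,1) entries."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += sum(abs(wi * rng.gauss(0.0, 1.0)) ** p for wi in w) ** (1.0 / p)
    return total / trials


def predicted_weighted_norm(w, p):
    """Large-M approximation nu_p * ||w||_p from Lemma 5."""
    return (nu_p_pow_p(p) * sum(wi ** p for wi in w)) ** (1.0 / p)
```

For weights taken from a finite set of values (a CM generator), the two quantities get closer as $M$ grows, in line with the lemma.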

With an appropriate modification of [8, Proposition 1], we can now prove the existence of random Gaussian RIP$_{p,w}$ matrices (see Appendix F).

Proposition 1 (RIP$_{p,w}$ Matrix Existence). Let $\Phi \sim \mathcal N^{M\times N}(0,1)$ and some CM weights $w \in \mathbb{R}^M_+$. Given $p \ge 1$ and $0 \le \eta < 1$, then there exists a constant $c > 0$ such that $\Phi$ is RIP$_{p,w}(K,\delta,\mu)$ with probability higher than $1 - \eta$ when we have jointly $M \ge 2\,(2\theta_p)^p$, and

$$M^{2/\max(2,p)} \ge c\, \delta^{-2}\, \theta_p\, \big(K \log[e \tfrac{N}{K} (1 + 12\,\delta^{-1})] + \log \tfrac{2}{\eta}\big). \qquad (18)$$

Moreover, the value $\mu = \mu(\ell_{p,w}^M, \ell_2^N)$ in (14) is given by $\mu = \mathbb{E}\|\xi\|_{p,w}$ for a random vector $\xi \sim \mathcal N^M(0,1)$. The RIP normalizing constant $\mu$ can be bounded owing to Lemma 5.

Remark 2. In light of Proposition 1, assumption (15) becomes reasonable since, following the simple argument presented in [8, Appendix B], the saturation of requirement (18) implies that $\delta_K$ decays as $O(\sqrt{K \log M}/M^{1/p})$ for RIP$_{p,w}$ Gaussian matrices. Therefore, for any value $p$, it is always possible to find an $M$ such that (15) holds. However, this is only possible in a high oversampling situation, i.e., for $\Omega((K \log N/K)^{p/2})$ measurements.



C. GBPDN stabilizes Heteroscedastic GGD Noise

Consider the following general signal sensing model

$$y = \Phi x + \varepsilon, \qquad (19)$$

where $\varepsilon \in \mathbb{R}^M$ is the noise vector. For heteroscedastic GGD noise, each $\varepsilon_i$ follows a zero-mean GGD$(0,\alpha_i,p)$ distribution with pdf $\propto \exp(-|t/\alpha_i|^p)$, where $p > 0$ is the shape parameter (the same for all $\varepsilon_i$'s), and $\alpha_i > 0$ the scale parameter [23]. It is obvious that

$$\mathbb{E}\varepsilon = 0 \quad\text{and}\quad \mathbb{E}(\varepsilon\varepsilon^T) = \Gamma(3/p)\,(\Gamma(1/p))^{-1}\, \mathrm{diag}(\alpha_1^2, \cdots, \alpha_M^2).$$
If one sets the weights to $w_i = 1/\alpha_i$ in GBPDN($\ell_{p,w}$), it can be seen that the associated constraint corresponds precisely to the negative log-likelihood of the joint pdf of $\varepsilon$. As detailed below, introducing these non-uniform weights $w_i$ leads to a reduction in the error of the reconstructed signal, relative to using constant weights. Without loss of generality, we here restrict our analysis to strictly $K$-sparse $x \in \Sigma_K$, and assume knowledge of bounds (estimators) for the $\ell_p$ and the $\ell_{p,w}$ norms used for characterizing $\varepsilon$, i.e., we know that $\|\varepsilon\|_p \simeq_M \epsilon$ and $\|\varepsilon\|_{p,w} \simeq_M \epsilon_{\rm st}$ for some $\epsilon, \epsilon_{\rm st} > 0$ to be detailed later.

In this case, if the random matrix $\Phi \sim \mathcal N^{M\times N}(0,1)$ is RIP$_{p,\mathbf 1}(K,\delta,\mu)$ for $p \ge 2$, with $\mu = \mathbb{E}\|\xi\|_p$ for $\xi \sim \mathcal N^M(0,1)$, Theorem 1 asserts that

$$\|x^* - x\| \le B_p\, \epsilon/\mu,$$

for $x^* = \Delta_{p,\mathbf 1}(y,\Phi,\epsilon)$ and $B_p \simeq_M 8$. Conversely, for the weights set to $w_i = 1/\alpha_i$, and assuming $\Phi$ is RIP$_{p,w}(K,\delta',\mu_{\rm st})$ with $\mu_{\rm st} = \mathbb{E}\|\xi\|_{p,w}$, we get

$$\|x^*_{\rm st} - x\| \le B'_p\, \epsilon_{\rm st}/\mu_{\rm st},$$

for $x^*_{\rm st} = \Delta_{p,w}(y,\Phi,\epsilon_{\rm st})$ and $B'_p \simeq_M 8$.

When the number of measurements $M$ is large, using the classical GGD absolute moments formula, the two bounds $\epsilon$ and $\epsilon_{\rm st}$ can be set close to $\epsilon^p \simeq_M \sum_i \mathbb{E}|\varepsilon_i|^p = \|\alpha\|_p^p/p$ and $\epsilon_{\rm st}^p \simeq_M \sum_i \mathbb{E}|\varepsilon_i/\alpha_i|^p = M/p$. Moreover, using Lemma 5, $\mu^p \simeq_M \sum_i \mathbb{E}|\xi_i|^p = M\,\mathbb{E}|Z|^p$ and $\mu_{\rm st}^p \simeq_M \mathbb{E}|Z|^p\, \|w\|_p^p$, where $Z \sim \mathcal N(0,1)$.

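Proposition 2. For an additive heteroscedastic noise $\varepsilon \in \mathbb{R}^M$ such that $\varepsilon_i \sim$ GGD$(0,\alpha_i,p)$, setting $w_i = 1/\alpha_i$ provides $\epsilon_{\rm st}^p/\mu_{\rm st}^p \lesssim_M \epsilon^p/\mu^p$. Therefore, asymptotically in $M$, GBPDN($\ell_{p,w}$) has a smaller reconstruction error compared to GBPDN($\ell_p$) when estimating $x$ from the sensing model (19).

Proof: Let us observe that $\epsilon_{\rm st}^p/\mu_{\rm st}^p \simeq_M M\,(p\,\mathbb{E}|Z|^p\, \|w\|_p^p)^{-1} = (p\,\mathbb{E}|Z|^p)^{-1}\, \big(\tfrac1M \sum_i \alpha_i^{-p}\big)^{-1}$. By the Jensen inequality, $\big(\tfrac1M \sum_i \alpha_i^{-p}\big)^{-1} \le \tfrac1M \sum_i \alpha_i^p$, so that $\epsilon_{\rm st}^p/\mu_{\rm st}^p \lesssim_M (p\,\mathbb{E}|Z|^p)^{-1}\, \|\alpha\|_p^p/M = \epsilon^p/\mu^p$. ■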
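The comparison in Proposition 2 can be evaluated for any vector of scale parameters. The sketch below only plugs the large-$M$ estimates of $\epsilon$, $\epsilon_{\rm st}$, $\mu$ and $\mu_{\rm st}$ given above into the two error factors; the function name `error_factors` is hypothetical.

```python
import math


def error_factors(alpha, p):
    """Return (eps/mu, eps_st/mu_st) from the large-M estimates of Section III-C.

    eps^p    ~ ||alpha||_p^p / p,   mu^p    ~ M * E|Z|^p,
    eps_st^p ~ M / p,               mu_st^p ~ E|Z|^p * ||w||_p^p  with w_i = 1/alpha_i.
    """
    M = len(alpha)
    nu_pp = 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)
    unstabilized = (sum(a ** p for a in alpha) / (p * M * nu_pp)) ** (1.0 / p)
    stabilized = (M / (p * nu_pp * sum(a ** (-p) for a in alpha))) ** (1.0 / p)
    return unstabilized, stabilized
```

The stabilized factor never exceeds the unstabilized one, since the harmonic mean of the $\alpha_i^p$ is bounded by their arithmetic mean; both coincide when all $\alpha_i$ are equal.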

The price to pay for this stabilization is an increase of the weighting dynamic $\theta_p = (\rho_\infty^{\max}/\rho_p^{\min})^2$ appearing in Proposition 1, which implies an increase in the number of measurements $M$ needed to ensure that the RIP$_{p,w}(K,\delta,\mu)$ is satisfied.

Example. Let us consider a simple situation where the $\alpha_i$'s take only two values, i.e., $\alpha_i \in \{1, H\}$ for some $H \ge 1$. Let us assume also that the proportion of $\alpha_i$'s equal to $H$ converges to $r \in [0,1]$ with $M$ as $|M^{-1}\#\{i : \alpha_i = H\} - r| = O(M^{-1})$. In this case, the stabilizing weights are $w_i = 1/\alpha_i \in \{1, 1/H\}$.

An easy computation provides

$$\epsilon^p/\mu^p \simeq_M (p\,\nu_p^p)^{-1}\, (r H^p + (1-r)),$$

$$\epsilon_{\rm st}^p/\mu_{\rm st}^p \simeq_M (p\,\nu_p^p)^{-1}\, (r H^{-p} + (1-r))^{-1},$$

so that the "stabilization gain" with respect to an unstabilized setting can be quantified by the ratio

$$\frac{\epsilon/\mu}{\epsilon_{\rm st}/\mu_{\rm st}} \simeq_M (r H^{-p} + (1-r))^{\frac1p}\, (r H^p + (1-r))^{\frac1p} \simeq_{H \gg 1} (r(1-r))^{\frac1p}\, H.$$

We see that the stabilization provides a clear gain which increases as the measurements get very unevenly corrupted, i.e., when $H$ is large. Interestingly, the higher $p$ is, the less sensitive this gain is to $r$. We also observe that the overhead in the number of measurements between the stabilized and the unstabilized situations is related to

$$\theta_p^{p/2} = \Big(\frac{\rho_\infty^{\max}}{\rho_p^{\min}}\Big)^p \simeq_M (r H^{-p} + (1-r))^{-1} \simeq_{H \gg 1} (1-r)^{-1}.$$
The limit case where $H \gg 1$ can be interpreted as ignoring a fraction $r$ of the measurements in the data fidelity constraint, keeping only those for which the noise is not dominating. In that case, the sufficient condition (18) in Proposition 1 for $\Phi$ to be RIP$_{p,w}$ tends to $\theta_p^{-p/2} M = (1-r)M = \Omega((K \log N/K)^{p/2})$, which is consistent with the fact that on average only a fraction $1-r$ of the $M$ measurements significantly participate in the CS scheme, i.e., $M' = (1-r)M$ must satisfy the common RIP requirement. For $p = 2$, this is somehow related to the democratic property of RIP matrices [4], i.e., the fact that a reasonable number of rows can be discarded from a matrix while preserving the RIP. This property was successfully used for discarding saturated CS measurements in the case of a limited dynamic quantizer.
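The two-valued example above lends itself to a quick numerical exploration. In this hedged sketch the gain is the product formula of the example, and the overhead is $\theta_p^{p/2} = \|w\|_\infty^p/(M^{-1}\|w\|_p^p)$ evaluated for $w_i \in \{1, 1/H\}$ with $r < 1$; both function names are hypothetical.

```python
def stabilization_gain(H: float, r: float, p: float) -> float:
    """Ratio (eps/mu)/(eps_st/mu_st) for alpha_i in {1, H}, a fraction r being equal to H."""
    return ((r * H ** (-p) + (1.0 - r)) * (r * H ** p + (1.0 - r))) ** (1.0 / p)


def measurement_overhead(H: float, r: float, p: float) -> float:
    """theta_p^(p/2) for the stabilizing weights w_i in {1, 1/H} (so that ||w||_inf = 1)."""
    return 1.0 / (r * H ** (-p) + (1.0 - r))
```

For $H = 1$ the gain equals 1 (nothing to stabilize), while for large $H$ it behaves like $(r(1-r))^{1/p} H$ and the overhead approaches $1/(1-r)$.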

IV. Dequantizing with Generalized Basis Pursuit DeNoise

Let us now instantiate the use of GBPDN to the reconstruction of signals in the QCS scenario defined in Section II. Under the quantization formalism defined in Lemma 3 and for Gaussian matrices $\Phi$, the factor $\epsilon/\mu$ in (16) can be shown to decrease as $1/\sqrt{p+1}$ asymptotically in $M$ and $B$. This asymptotic and almost sure result, which relies on the SLLN (see Appendix G), suggests increasing $p$ to the highest value allowed by (18) in order to decrease the GBPDN reconstruction error.

Proposition 3 (Dequantizing Reconstruction Error). Given $x \in \mathbb{R}^N$ and $\Phi \sim \mathcal N^{M\times N}(0,1)$, assume that the entries of $z = \Phi x$ are iid realizations from $Z \sim \mathcal N(0,\sigma_0^2)$. We take the corresponding optimal compressor function $\mathcal G$ defined in (3) and the $p$-optimal $B$-bits scalar quantizer $\mathcal Q_p$ as defined in (8). Then, the ratio $\epsilon/\mu$ given in (16) is asymptotically and almost surely bounded by

$$\frac{\epsilon}{\mu} \lesssim_{B,M} c'\, \frac{2^{-B}}{\sqrt{p+1}},$$

with $c' = (9/8)(e\pi/3)^{1/2}$.

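Notice that, under HRA and for large $M$, it is possible to provide a rough estimation of the weighting dynamic $\theta_p$ when the weights are those provided by the D$_p$C constraints. Indeed, since $w_i(p) = \mathcal G'(\mathcal Q_p[y_i])^{(p-2)/p}$ and $\mathcal G' = \gamma_{0,\sqrt3\sigma_0}$, we find

$$\|w\|_p^p = \sum_i \mathcal G'(\mathcal Q_p[y_i])^{p-2} \simeq_M M \sum_k \mathcal G'^{\,p-2}(\omega_{k,p})\, p_k$$

$$\simeq_{B,M} M\, (2\pi 3\sigma_0^2)^{(2-p)/2}\, (2\pi\sigma_0^2)^{-1/2} \sum_k \tau_k \exp\big(-\tfrac{p+1}{6\sigma_0^2}\, \omega_{k,p}^2\big)$$

$$\simeq_{B,M} M\, (2\pi 3\sigma_0^2)^{(2-p)/2}\, (2\pi\sigma_0^2)^{-1/2}\, \big(2\pi \tfrac{3\sigma_0^2}{p+1}\big)^{1/2}$$

$$= M\, (2\pi\sigma_0^2)^{(2-p)/2}\, 3^{(3-p)/2}\, (p+1)^{-1/2},$$

where we recall that $p_k = \int_{\mathcal R_k} \varphi_0(t)\,dt \simeq_B \varphi_0(c')\,\tau_k$, for any $c' \in \mathcal R_k$ (see the proof of Lemma 9).

Moreover, using (10) and since one of the two smallest quantization bins is $[0, \tau_{\mathcal B/2})$,

$$\|w\|_\infty^p \simeq_B (\alpha/\tau_{\mathcal B/2})^{p-2} = (\alpha/\mathcal G^{-1}(1/2 + \alpha))^{p-2} \simeq_B (2\pi 3\sigma_0^2)^{(2-p)/2}.$$

Therefore, estimating $\theta_p^{p/2}$ with $M\|w\|_\infty^p/\|w\|_p^p$, we find

$$\theta_p^{p/2} \simeq_{B,M} \sqrt{(p+1)/3}.$$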
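The estimate $\theta_p^{p/2} \simeq \sqrt{(p+1)/3}$ can be cross-checked by numerical integration in the high-resolution limit. This is a sketch under stated assumptions: the sum over bins is replaced by the integral of $\mathcal G'(t)^{p-2}\varphi_0(t)$, $\|w\|_\infty^p$ is replaced by $\mathcal G'(0)^{p-2}$, and the function names are hypothetical.

```python
import math


def gauss_pdf(t: float, sigma: float) -> float:
    """Density of N(0, sigma^2) at t."""
    return math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def theta_pow_half_p(p: float, sigma0: float = 1.0, n: int = 20001) -> float:
    """Estimate theta_p^(p/2) = M ||w||_inf^p / ||w||_p^p in the high-resolution limit.

    Uses G' = gauss_pdf(., sqrt(3)*sigma0), ||w||_inf^p ~ G'(0)^(p-2) and
    ||w||_p^p / M ~ integral of G'(t)^(p-2) * phi_0(t) dt (trapezoidal rule).
    """
    s = math.sqrt(3.0) * sigma0
    lo, hi = -12.0 * sigma0, 12.0 * sigma0
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        t = lo + i * h
        f = gauss_pdf(t, s) ** (p - 2.0) * gauss_pdf(t, sigma0)
        total += f if 0 < i < n - 1 else 0.5 * f
    return gauss_pdf(0.0, s) ** (p - 2.0) / (total * h)
```

The result does not depend on $\sigma_0$, as expected from the closed form.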



Therefore, at a given $p \ge 2$, since (18) implies that $M$ evolves like $\Omega(\theta_p^{p/2}\, (K \log N/K)^{p/2})$, using the weighting induced by GBPDN($\ell_{p,w}$) requires collecting $\sqrt{(p+1)/3}$ times more measurements than GBPDN($\ell_p$) in order to ensure the appropriate RIP$_{p,w}$ property. This represents part of the price to pay for guaranteeing bounded reconstruction error by adapting to non-uniform quantization.

Dequantizing is Stabilizing Quantization Distortion:

In connection with the procedure developed in Section III-C, the weights and the $p$-optimal levels introduced in Lemma 3 can be interpreted as a "stabilization" of the quantization distortion seen as a heteroscedastic noise. This means that, asymptotically in $M$, when selecting these weights and levels, all quantization regions $\mathcal R_k$ contribute equally to the $\ell_{p,w}$ distortion measure.

To understand this fact, we start by studying the following relation shown in the proof of Lemma 3 (see Appendix D):

$$\|\mathcal Q_p[z] - z\|_{p,w}^p \simeq_M M \sum_k [\mathcal G'(\omega_{k,p})]^{p-2} \int_{\mathcal R_k} |t - \omega_{k,p}|^p\, \varphi_0(t)\,dt. \qquad (20)$$

Using the threshold $T(B) = \Theta(\sqrt B)$ and $\mathcal T = [-T(B), T(B)]$ as defined in Lemma 2, the proof of Lemma 9 in Appendix D shows that

$$\|\mathcal Q_p[z] - z\|_{p,w}^p \simeq_{B,M} M \sum_{k : \mathcal R_k \subset \mathcal T} [\mathcal G'(\omega_{k,p})]^{p-2} \int_{\mathcal R_k} |t - \omega_{k,p}|^p\, \varphi_0(t)\,dt, \qquad (21)$$

$$\simeq_B M \sum_{k : \mathcal R_k \subset \mathcal T} [\mathcal G'(\omega_{k,p})]^{p-2}\, \frac{\tau_k^{p+1}}{(p+1)\,2^p}\, \varphi_0(\omega_{k,p}), \qquad (22)$$

using (9). However, using (10) and the relation $\mathcal G' = \varphi_0^{1/3}/|||\varphi_0|||_{1/3}^{1/3}$, we find $\tau_k^3\, \varphi_0(\omega_{k,p}) \simeq_B \alpha^3\, |||\varphi_0|||_{1/3}$.

Therefore, each term of the sum in (21) provides a contribution

$$[\mathcal G'(\omega_{k,p})]^{p-2}\, \frac{\tau_k^{p+1}}{(p+1)\,2^p}\, \varphi_0(\omega_{k,p}) \simeq_B \frac{\alpha^{p+1}}{(p+1)\,2^p}\, |||\varphi_0|||_{1/3},$$

which is independent of $k$! This phenomenon is well known for $p = 2$ and may actually serve for defining $\mathcal G'$ itself [5]. The fact that this effect is preserved for $p \ge 2$ is a surprise for us.
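The equal contribution of the quantization bins can be observed numerically. The sketch below is an illustration under explicit assumptions, not the procedure of the paper: thresholds are taken as $t_k = \mathcal G^{-1}(k\alpha)$ with $\alpha = 2^{-B}$ and $\mathcal G$ the cdf of $\mathcal N(0, 3\sigma_0^2)$, the bin midpoint stands in for the $p$-optimal level, and each contribution is evaluated with the high-resolution expression $[\mathcal G'(c)]^{p-2}\tau_k^{p+1}\varphi_0(c)/((p+1)2^p)$. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def bin_contributions(B: int, p: float, sigma0: float = 1.0, t_max: float = 2.0):
    """High-resolution contribution of each bin lying inside [-t_max, t_max]."""
    G = NormalDist(0.0, math.sqrt(3.0) * sigma0)   # compressor: cdf of N(0, 3 sigma0^2)
    phi0 = NormalDist(0.0, sigma0)                 # source distribution
    n_bins = 2 ** B
    alpha = 1.0 / n_bins
    out = []
    for k in range(1, n_bins - 1):                 # skip the two unbounded bins
        a, b = G.inv_cdf(k * alpha), G.inv_cdf((k + 1) * alpha)
        if a < -t_max or b > t_max:
            continue
        c, tau = 0.5 * (a + b), b - a              # midpoint and bin width
        out.append(G.pdf(c) ** (p - 2.0) * tau ** (p + 1.0)
                   / ((p + 1.0) * 2.0 ** p) * phi0.pdf(c))
    return out
```

For a moderately fine quantizer (e.g., `B = 8`), the returned values are nearly identical across bins and close to $\alpha^{p+1}|||\varphi_0|||_{1/3}/((p+1)2^p)$, where $|||\varphi_0|||_{1/3} = (\int \varphi_0^{1/3})^3$.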



V. Numerical Experiments 

We first describe how to numerically solve the GBPDN optimization problem using a primal-dual convex optimization scheme, then illustrate the use of GBPDN for stabilizing heteroscedastic Gaussian noise on the CS measurements. Finally, we apply GBPDN for reconstructing signals in the quantized CS scenario described in Section II.



A. Solving GBPDN 

The optimization problem GBPDN($\ell_{p,w}$) is a special instance of the general form

$$\min_{u \in \mathbb{R}^N} f(u) + g(Lu), \qquad (23)$$

where $f$ and $g$ are closed convex functions that are not infinite everywhere (i.e., proper functions), and $L = \mathrm{diag}(w)\Phi$ is a bounded linear operator, with $f(u) := \|u\|_1$, and $g(v) := \imath_{\mathbb B_p^\epsilon}(v - y)$ where $\imath_{\mathbb B_p^\epsilon}(v)$ is the indicator function of the $\ell_p$-ball $\mathbb B_p^\epsilon$ centered at zero and of radius $\epsilon$, i.e., $\imath_{\mathbb B_p^\epsilon}(v) = 0$ if $v \in \mathbb B_p^\epsilon$ and $+\infty$ otherwise. For the case of GBPDN($\ell_{p,w}$), both $f$ and $g$ are non-smooth but the associated proximity operators (to be defined shortly) can be computed easily. This will allow us to minimize the GBPDN($\ell_{p,w}$) objective by calling on proximal splitting algorithms.
Before delving into the details of the minimization splitting algorithm, we recall some results from convex analysis. The proximity operator [24] of a proper closed convex $f$ is defined as the unique solution

$$\mathrm{prox}_f(u) = \operatorname*{argmin}_z\ \tfrac12 \|z - u\|_2^2 + f(z).$$

If $f = \imath_C$ for some closed convex set $C$, $\mathrm{prox}_f$ is equivalent to the orthogonal projector onto $C$, $\mathrm{proj}_C$. $f^*$ is the Legendre-Fenchel conjugate of $f$. For $\lambda > 0$, the proximity operator of $\lambda f^*$ can be deduced from that of $f/\lambda$ through Moreau's identity

$$\mathrm{prox}_{\lambda f^*}(u) = u - \lambda\, \mathrm{prox}_{\lambda^{-1} f}(u/\lambda).$$
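Moreau's identity can be verified on a textbook pair, chosen here for illustration only: for $f = \|\cdot\|_1$, the conjugate $f^*$ is the indicator of the unit $\ell_\infty$-ball, so $\mathrm{prox}_{\lambda f^*}$ must coincide with the projection onto that ball (a component-wise clipping). The function names are hypothetical.

```python
def soft_threshold(u, t):
    """prox of t*||.||_1: component-wise shrinkage of u towards zero by t."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in u]


def prox_conjugate_l1(u, lam):
    """prox of lam*f* for f = ||.||_1, obtained through Moreau's identity."""
    inner = soft_threshold([x / lam for x in u], 1.0 / lam)   # prox_{f/lam}(u/lam)
    return [x - lam * z for x, z in zip(u, inner)]


def project_linf_ball(u):
    """Projection onto {v : ||v||_inf <= 1}, i.e., the same prox computed directly."""
    return [min(1.0, max(-1.0, x)) for x in u]
```

Both routes agree for any $\lambda > 0$, which is the mechanism used below to obtain $\mathrm{prox}_{\sigma g^*}$ from a projection onto an $\ell_p$-ball.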

Solving (23) with an arbitrary bounded linear operator $L$ can be achieved using primal-dual methods motivated by the classical Kuhn-Tucker theory. Starting from methods to solve saddle function problems such as the Arrow-Hurwicz method [25], this problem has received a lot of attention recently, e.g., [26]-[28]. In this paper, we use the relaxed Arrow-Hurwicz algorithm as revitalized recently in [27]. Adapted to our problem, its steps are summarized in Algorithm 1.

Algorithm 1 Primal-dual scheme for solving GBPDN($\ell_{p,w}$).
Inputs: Measurements $y$, sensing matrix $\Phi$, weights $w$.

Parameters: Iteration number $N_{\rm iter}$, $\theta \in [0,1]$, step-sizes $\sigma > 0$ and $\tau > 0$ with $\tau\sigma\|w\|_\infty^2 \|\Phi\|^2 < 1$.

Main iteration:

for $k = 0$ to $N_{\rm iter} - 1$ do

• Update the dual variable:

$$v_{k+1} = \mathrm{prox}_{\sigma g^*}(v_k + \sigma L \bar u_k).$$

• Update the primal variable:

$$u_{k+1} = \mathrm{prox}_{\tau f}(u_k - \tau L^T v_{k+1}).$$

• Approximate extragradient step:

$$\bar u_{k+1} = u_{k+1} + \theta\,(u_{k+1} - u_k).$$

Output: Signal $u_{N_{\rm iter}}$.



A sufficient condition for the sequences of Algorithm 1 to converge is to choose $\sigma$ and $\tau$ such that $\tau\sigma\|w\|_\infty^2 \|\Phi\|^2 < 1$. It has been shown in [27, Theorem 1] that under this condition and for $\theta = 1$, the primal sequence $(u_k)_{k\in\mathbb N}$ converges to a (possibly strict) global minimizer of GBPDN($\ell_{p,w}$), with the rate $O(1/k)$ in ergodic sense on the partial duality gap.

Proximity operator of $f$: For $f(u) = \|u\|_1$, $\mathrm{prox}_{\tau f}(u)$ is the popular component-wise soft-thresholding of $u$ with threshold $\tau$.

Proximity operator of $g$: Recall that $g(v) = \imath_{\mathbb B_p^\epsilon}(v - y)$. Using Moreau's identity above, and proximal calculus rules for translation and scaling, we have

$$\mathrm{prox}_{\sigma g^*}(v) = v - \sigma y - \mathrm{proj}_{\mathbb B_p^{\sigma\epsilon}}(v - \sigma y).$$

It remains to compute the orthogonal projection $\mathrm{proj}_{\mathbb B_p^1}$ to get $\mathrm{proj}_{\mathbb B_p^{\sigma\epsilon}} = \sigma\epsilon\, \mathrm{proj}_{\mathbb B_p^1}(\cdot/(\sigma\epsilon))$. For $p = 2$ and $p = +\infty$, this projector has an easy closed form. For $2 < p < +\infty$, we used the Newton method we proposed in [8] for solving the related Karush-Kuhn-Tucker system, which is reminiscent of the strategy underlying sequential quadratic programming.
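The iterations of Algorithm 1 can be sketched in a few lines for the special case $p = 2$, where the ball projection is a simple rescaling. This is a minimal illustration, not the implementation used for the experiments: it uses plain Python lists, starts from zero vectors, bounds $\|L\|$ by the Frobenius norm to pick the step sizes, and (an assumption made here, since $L = \mathrm{diag}(w)\Phi$ carries the weights) shifts the dual prox by $\mathrm{diag}(w)\,y$. All function names are hypothetical.

```python
import math


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def rmatvec(A, v):
    """Compute A^T v."""
    return [sum(A[i][j] * v[i] for i in range(len(A))) for j in range(len(A[0]))]


def project_l2_ball(v, radius):
    """Orthogonal projection onto the l2-ball of given radius centered at zero."""
    nrm = math.sqrt(sum(x * x for x in v))
    return list(v) if nrm <= radius else [x * radius / nrm for x in v]


def soft_threshold(u, t):
    """Component-wise soft-thresholding, the prox of t*||.||_1."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in u]


def gbpdn_l2(y, Phi, w, eps, n_iter=500, theta=1.0):
    """Primal-dual iterations of Algorithm 1 for p = 2."""
    L = [[wi * a for a in row] for wi, row in zip(w, Phi)]    # L = diag(w) Phi
    yw = [wi * yi for wi, yi in zip(w, y)]                     # weighted data
    # The Frobenius norm upper-bounds the spectral norm, so tau*sigma*||L||^2 < 1.
    frob = math.sqrt(sum(a * a for row in L for a in row))
    tau = sigma = 0.9 / frob
    u = [0.0] * len(Phi[0])
    u_bar, v = list(u), [0.0] * len(y)
    for _ in range(n_iter):
        # Dual update: prox of sigma*g* via Moreau's identity and a ball projection.
        s = [vi + sigma * li for vi, li in zip(v, matvec(L, u_bar))]
        shifted = [si - sigma * yi for si, yi in zip(s, yw)]
        v = [a - b for a, b in zip(shifted, project_l2_ball(shifted, sigma * eps))]
        # Primal update: soft-thresholding with threshold tau.
        u_new = soft_threshold([ui - tau * g for ui, g in zip(u, rmatvec(L, v))], tau)
        # Approximate extragradient step.
        u_bar = [un + theta * (un - uo) for un, uo in zip(u_new, u)]
        u = u_new
    return u
```

On the toy problem `gbpdn_l2([1.0], [[1.0]], [1.0], 0.25)` the iterates approach $0.75$, the point of smallest magnitude within distance $0.25$ of the measurement. For $2 < p < +\infty$ the projection step would have to be replaced by a numerical routine such as the Newton method mentioned above.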
B. Gaussian Noise Stabilization Illustration

We explore numerically the impact of using non-uniform weights (e.g., stabilizing the measurement noise) for signal reconstruction when the CS measurements are corrupted by heteroscedastic Gaussian noise, as discussed in Section III-C. This illustrates for $p = 2$ both the gain induced by stabilizing the sensing noise and the increase of measurements necessary for observing this gain.

In this illustration, we set the problem dimensions to $N = 1024$, $K = 16$, and let the oversampling factor be in $M/K \in \{5, 10, \cdots, 50\}$. The $K$-sparse unit norm signals were generated independently according to a Bernoulli-Gaussian mixture model with $K$-length support picked uniformly at random in $[N]$, and the non-zero signal entries drawn from a zero-mean Gaussian distribution with variance $\simeq 1/K$. Noisy measurements were simulated by setting $y = \Phi x + \varepsilon$, with $\varepsilon_i \sim \mathcal N(0,\sigma_i^2)$ and $\Phi \sim \mathcal N^{M\times N}(0,1)$. The heteroscedastic behavior of $\varepsilon$ has been designed so that $\sigma_i \sim_{\rm iid} U([\sigma_0 - \delta_0, \sigma_0 + \delta_0])$ with $\sigma_0 = 0.1$ and $\delta_0 = 0.6\,\sigma_0$.

Two reconstruction methods were tested: one with and the other without stabilizing the noise variance. In the first case, the weights have been set to $w_i = 1/\sigma_i$, while in the second $w = \mathbf 1$. Since the purpose of this analysis is not focused on the design of efficient noise power estimators, $\epsilon$ and $\epsilon_{\rm st}$ have been simply set by an oracle to $\epsilon_{\rm st} = \|y - \Phi x\|_{2,w}$ and $\epsilon = \|y - \Phi x\|_2$.

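Given the parameters above, we compute the weighting dynamic $\theta_p \simeq_M \|w\|_\infty^2/(M^{-1}\|w\|_2^2) = \frac{\sigma_0+\delta_0}{\sigma_0-\delta_0} = 4$, and the average stabilization gain should be (see Proposition 2)

$$20 \log_{10} \|x - x^*\|/\|x - x^*_{\rm st}\| \simeq_M 20 \log_{10} (\epsilon\,\|w\|)/(\epsilon_{\rm st}\sqrt M) \lesssim 2.43\ \mathrm{dB}.$$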
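The two numbers quoted above follow from elementary moments of the uniform law. As a sketch (the function names are hypothetical), for $p = 2$ and $\sigma_i \sim U([a,b])$ with $a = \sigma_0 - \delta_0$ and $b = \sigma_0 + \delta_0$, one has $\epsilon^2 \simeq M\,\mathbb E\sigma^2$, $\epsilon_{\rm st}^2 \simeq M$ and $\|w\|^2 \simeq M\,\mathbb E\sigma^{-2}$, so that the predicted gain is $10\log_{10}(\mathbb E\sigma^2\; \mathbb E\sigma^{-2})$.

```python
import math


def predicted_gain_db(sigma0: float = 0.1, delta0: float = 0.06) -> float:
    """Large-M stabilization gain (dB) for p = 2 and sigma_i uniform on [sigma0-delta0, sigma0+delta0]."""
    a, b = sigma0 - delta0, sigma0 + delta0
    mean_sq = (a * a + a * b + b * b) / 3.0      # E[sigma^2] for a uniform law on [a, b]
    mean_inv_sq = 1.0 / (a * b)                  # E[sigma^-2] for the same law
    return 10.0 * math.log10(mean_sq * mean_inv_sq)


def weighting_dynamic(sigma0: float = 0.1, delta0: float = 0.06) -> float:
    """theta_2 ~ ||w||_inf^2 / (M^-1 ||w||_2^2) = (sigma0 + delta0) / (sigma0 - delta0)."""
    return (sigma0 + delta0) / (sigma0 - delta0)
```

With the default values ($\sigma_0 = 0.1$, $\delta_0 = 0.06$), `predicted_gain_db()` evaluates to about 2.43 dB and `weighting_dynamic()` to 4.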
Numerically, GBPDN($\ell_{2,w}$) and GBPDN($\ell_2$) = BPDN have been solved with the method described in Section V-A until the relative $\ell_2$-change in the iterates was smaller than $10^{-6}$ (with a maximum of 2000 iterations). Reconstruction results were averaged over 50 experiments. In Fig. 2(a), the reconstruction signal-to-noise ratio (SNR) of the stabilized reconstruction is clearly superior to the unstabilized one and this gain increases with increasing oversampling ratio $M/K$. This SNR gain is displayed in Fig. 2(b). The dashed horizontal line represents the theoretical prediction of 2.43 dB, which turns out to be an upper bound on the numerically observed gain.




Fig. 2: Stabilized versus unstabilized reconstruction using GBPDN($\ell_{2,w}$) and BPDN, respectively. (a) The reconstruction SNR using stabilized (triangles) and unstabilized (squares) methods. (b) Observed (triangles) and theoretically predicted (dashed) SNR gain at 2.43 dB brought by stabilization.
C. Non-Uniform Quantization

We describe several simulations challenging the power of GBPDN for reconstructing sparse signals from non-uniformly quantized measurements when the weights and the $p$-optimal levels of Lemma 3 are combined. Several configurations have been tested for different $p \ge 2$, oversampling ratio $M/K$, number of bits $B$ and for non-uniform and uniform quantization.

For this experiment, we set the key dimensions to $N = 1024$, $K = 16$, $B = 4$, and the $K$-sparse unit norm signals have been generated as in the previous section. The oversampling ratio was taken as $M/K \in \{10, 15, \cdots, 45\}$, $p \in \{2, 4, \cdots, 10\}$ and the matrix has been drawn randomly as $\Phi \sim \mathcal N^{M\times N}(0,1)$. The non-uniform quantization of the measurements $\Phi x$ was defined with a compressor $\mathcal G$ associated to $\gamma_{0,\sigma_0}$ according to (3). The weights $w$ were computed as in Lemma 3 and the $p$-optimal levels using the numerical method described in Appendix H.

For the sake of completeness, we also compared some results to those obtained for a uniformly quantized CS scenario. In this case, the measurements $z = \Phi x$ are quantized as $y_i = \alpha' \lfloor z_i/\alpha' \rfloor + \alpha'/2$, where the quantization bin width $\alpha' = \alpha'(B)$ has been set by dividing regularly the interval $[-\|z\|_\infty, \|z\|_\infty]$ into the same number of bins as those used for the non-uniform quantization.
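Both quantization rules can be written compactly. The sketch below is illustrative only: `uniform_quantize` follows the mid-point formula above, while `compander_bin` assumes that the non-uniform bins are $[\mathcal G^{-1}(k 2^{-B}), \mathcal G^{-1}((k+1)2^{-B}))$ with $\mathcal G$ the cdf of $\mathcal N(0, 3\sigma_0^2)$, and returns a 0-based bin index; the $p$-optimal levels of Lemma 3 are not computed here. Both names are hypothetical.

```python
import math
from statistics import NormalDist


def uniform_quantize(z, n_bins):
    """Mid-point uniform quantizer y_i = a*floor(z_i/a) + a/2, with a = 2*max|z_i|/n_bins."""
    width = 2.0 * max(abs(x) for x in z) / n_bins
    return [width * math.floor(x / width) + width / 2.0 for x in z]


def compander_bin(x, B, sigma0=1.0):
    """0-based index of the compander bin containing x (uniform binning of G(x))."""
    G = NormalDist(0.0, math.sqrt(3.0) * sigma0)
    return min(int(G.cdf(x) * 2 ** B), 2 ** B - 1)
```

Note that, with this formula, the largest positive sample sits exactly on the upper edge of the interval and is mapped half a bin beyond it.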

Again, GBPDN was solved with the primal-dual scheme described in Section V-A until either the relative $\ell_2$-change in iterates was smaller than $10^{-6}$ or a maximum number of iterations of 2000 was reached. Finally, all the reconstruction results were averaged over 50 replications of sparse signals for each combination of parameters.



Fig. 3(a) displays the evolution of the signal reconstruction quality, as measured by the SNR, as a function of the oversampling factor $M/K$. We clearly see a reconstruction quality improvement with respect to both the uniformly quantized CS scheme (dashed curve) and to increasing values of $p$ and $M/K$. This last effect is better analyzed in Fig. 3(b), where the SNR gain with respect to $p = 2$ for various values of $p$ is shown. As predicted by Proposition 3, we clearly see that, as soon as the ratio $M/K$ is large, taking a higher $p$ value leads to a higher reconstruction quality than the one obtained for $p = 2$ (BPDN). Moreover, Fig. 3(b) confirms that when $p$ increases, the minimal measurement number inducing a positive SNR gain increases. For instance, to achieve a positive gain at $p = 4$, we must have $M/K \ge 15$, while at $p = 10$, $M/K$ must be higher than 20. At fixed $p$, the reconstruction quality also increases monotonically with $M/K$.

We observe that, given the oversampling ratio, these experimental results allow us to increase $p$ to a greater extent than would be allowed by our theory deployed in Section IV. In particular, the sufficient condition (18) dictated by Proposition 1 requires the number of measurements $M$ to scale as $K^{p/2}$ (ignoring the usual logarithmic terms) in order to ensure the RIP$_{p,w}$. This would imply an exponential increase in the number of measurements needed as $p$ increases. However, from Fig. 3(b), one can see that for $M/K = 15$, $p = 4$ was the largest value before performance starts degrading. With $M/K = 20$, $p$ could be increased to 6 before degradation, and to 8 before degradation with $M/K = 30$. At least for this example, we do not observe such a severe exponential dependence in the needed oversampling in order to benefit from error decrease when increasing $p$.

In Fig. 4, the quantization consistency of the reconstructed signals is tested by looking at the histogram of $\alpha^{-1}(\mathcal G(\Phi x^*) - \mathcal G(y))$. We do observe that this histogram is closer to a uniform distribution for $p = 10$ than for $p = 2$, in good agreement with the "companded" quantizer definition $\mathcal Q = \mathcal G^{-1} \circ \mathcal Q_\alpha \circ \mathcal G$, showing that in the domain compressed by $\mathcal G$, this quantizer is similar to a uniform one.

Fig. 4: Testing the Quantization Consistency (QC). (a) Histogram of the components of $\alpha^{-1}(\mathcal G(\Phi x^*) - \mathcal G(y))$ for $p = 2$ and $M/K = 40$ (averaged over 100 trials). (b) Same histogram for $p = 10$. The QC is better respected in this case.

As a last test, we have more thoroughly compared the uniform quantization scenario described in the experimental setup above, decoded with the BPDQ$_p$ decoder developed in [8], to the non-uniform case studied in this paper. More precisely, Fig. 5 shows the reconstruction SNR gain between non-uniform and uniform quantization at various $p$, i.e., SNR(GBPDN($\ell_{p,w}$)) $-$ SNR(BPDQ$_p$). We see that, at a given $p$, this gain improves with $M/K$, and the highest SNR improvement values are obtained for $p = 2$. This points to the fact that for $p \neq 2$, the quantization scheme is not optimized for reducing the $\ell_{p,w}$-norm distortion. This would require us to change the quantization scenario by optimizing not only the $p$-optimal levels but also the thresholds. This is left for future research.

VI. Conclusion 

In this paper, we have shown that, when the compressive measurements of a sparse or compressible signal are non-uniformly quantized, there is a clear interest in modifying the reconstruction procedure by adapting the way it imposes the reconstructed signal to "match" the observed data. In particular, we have proved that in an oversampled scenario, replacing the common BPDN $\ell_2$-norm constraint by a weighted $\ell_p$-norm adjusted to the non-uniform nature of the quantizer reduces the reconstruction error by a factor of $\sqrt{p+1}$. Moreover, we showed that this improvement stems from a stabilization of the quantization distortion seen as an additive heteroscedastic GGD noise on the measurements.

Fig. 5: Reconstruction gain (in dB) between non-uniform and uniform quantization at the same $p$.

In future work, we will investigate if the quantization scheme can also be optimized with respect to the proposed reconstruction procedure, i.e., by adjusting the thresholds for minimizing the weighted $\ell_p$-distortion at a fixed bit budget.

Appendix A 
Preparatory Lemmata 

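This appendix contains several key lemmata that are useful for the subsequent proofs developed in the other appendices.

The first lemma will serve later to evaluate asymptotically the contribution of each quantization bin to the global quantizer distortion measured with the $\ell_{p,w}$-norm when a Gaussian source (with pdf $\varphi_0$) is quantized.

Lemma 6. Given $a, b \in \mathbb{R}$ with $a < b$, $n \in \mathbb{N} \setminus \{0\}$ and a Gaussian pdf $\varphi_0 = \gamma_{0,\sigma_0}$. Let $\lambda_n$ be the (unique) minimizer of $\min_{\lambda \in [a,b]} \int_a^b |t - \lambda|^n \varphi_0(t)\,dt$. Then,

$$\int_a^b |t - \lambda_n|^n \varphi_0(t)\,dt \ge \frac{(b-a)^{n+1}}{(n+1)\,2^{n+1}}\, \big(1 + S^{-\frac{n+1}{n}}\big)\, C, \qquad (24)$$

$$\int_a^b |t - \lambda_n|^n \varphi_0(t)\,dt \le \frac{(b-a)^{n+1}}{(n+1)\,2^{n+1}}\, \big(1 + S^{\frac{n+1}{n}}\big)\, D, \qquad (25)$$

$$\frac{1}{1+S^{1/n}}\,(S^{1/n} a + b) \le \lambda_n \le \frac{1}{1+S^{1/n}}\,(a + S^{1/n} b), \qquad (26)$$

with $C := \min_{t\in[a,b]} \varphi_0(t)$, $D := \max_{t\in[a,b]} \varphi_0(t)$ and $S = D/C$.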

Proof: Let us first show the upper bound (25). In Lemma 1 and its proof, it was shown that $\lambda_n$ exists and is unique, i.e., the minimization problem is well-posed. Furthermore, $\lambda_n$ satisfies

$$A := \int_a^{\lambda_n} (\lambda_n - t)^{n-1} \varphi_0(t)\,dt = \int_{\lambda_n}^b (t - \lambda_n)^{n-1} \varphi_0(t)\,dt.$$

Since $\varphi_0(t) \in [C, D]$ for $t \in [a,b]$, we have $(\lambda_n - a)^n C \le nA \le (\lambda_n - a)^n D$ and $(b - \lambda_n)^n C \le nA \le (b - \lambda_n)^n D$. This implies $(\lambda_n - a)^n \le (\tfrac DC)\,(b - \lambda_n)^n$ and $(b - \lambda_n)^n \le (\tfrac DC)\,(\lambda_n - a)^n$, from which we easily deduce (26).

Since $\int_a^b |t - \lambda_n|^n \varphi_0(t)\,dt = \int_a^{\lambda_n} (\lambda_n - t)^n \varphi_0(t)\,dt + \int_{\lambda_n}^b (t - \lambda_n)^n \varphi_0(t)\,dt$, we find $\int_a^b |t - \lambda_n|^n \varphi_0(t)\,dt \le \frac{1}{n+1}\, [(\lambda_n - a)^{n+1} + (b - \lambda_n)^{n+1}]\, D$. From $((\lambda_n - a)/(b - \lambda_n))^n \in [C/D, D/C]$, we find that $\int_a^b |t - \lambda_n|^n \varphi_0(t)\,dt$ is smaller than

$$\frac{1}{n+1}\, \min\big((\lambda_n - a)^{n+1}, (b - \lambda_n)^{n+1}\big)\, \big[1 + (\tfrac DC)^{\frac{n+1}{n}}\big]\, D.$$

This provides (25) since $\min(\lambda_n - a, b - \lambda_n) \le (b-a)/2$. The bound (24) is obtained similarly. ■

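The following lemma presents a generalization of "Q-function like" bounds for lower partial moments of a Gaussian pdf.

Lemma 7. Let $\lambda > 0$, $n \in \mathbb{N}$ and $\varphi = \gamma_{0,1}$. Let us define $Q_n(\lambda) := \int_\lambda^{+\infty} (t - \lambda)^n \varphi(t)\,dt$. Then, $Q_n(\lambda) = \Theta(\lambda^{-(n+1)} \varphi(\lambda))$. More precisely,

$$\frac{n!\,\lambda^{n+1}}{\prod_{k=1}^{n+1} (\lambda^2 + k)}\, \varphi(\lambda) \le Q_n(\lambda) \le \frac{n!}{\lambda^{n+1}}\, \varphi(\lambda).$$

This lemma generalizes the well known bound on $Q = Q_0$, namely $\frac{\lambda}{1+\lambda^2}\, \varphi(\lambda) \le Q(\lambda) \le \frac1\lambda\, \varphi(\lambda)$.

Proof: The proof involves integration by parts, the identities $-\varphi'(u) = u\,\varphi(u)$ and $(\varphi(u)/u^n)' = -(1 + \frac{n}{u^2})\, \frac{\varphi(u)}{u^{n-1}}$. Therefore, the upper bound is a simple consequence of

$$Q_n(\lambda) \le \frac1\lambda \int_\lambda^{+\infty} t\,(t - \lambda)^n \varphi(t)\,dt = \frac n\lambda\, Q_{n-1}(\lambda) \le \frac{n!}{\lambda^n}\, Q(\lambda) \le \frac{n!}{\lambda^{n+1}}\, \varphi(\lambda).$$

To get the lower bound, observe first that, defining $Q_{n,k}(\lambda) := \int_\lambda^{+\infty} (t - \lambda)^n t^{-k} \varphi(t)\,dt$, we find

$$(1 + \tfrac{k+1}{\lambda^2})\, Q_{n,k}(\lambda) \ge \int_\lambda^{+\infty} (t - \lambda)^n\, (1 + \tfrac{k+1}{t^2})\, t^{-k} \varphi(t)\,dt = n\, Q_{n-1,k+1}(\lambda).$$

Therefore, $Q_n(\lambda) \ge \frac{n}{1 + 1/\lambda^2}\, Q_{n-1,1}(\lambda) \ge \cdots \ge \frac{n!}{\prod_{k=1}^{n} (1 + k/\lambda^2)}\, Q_{0,n}(\lambda)$. But $(1 + \frac{n+1}{\lambda^2})\, Q_{0,n}(\lambda) \ge \varphi(\lambda)/\lambda^{n+1}$, so that $Q_n(\lambda) \ge \frac{n!\,\lambda^{2n+2}}{\prod_{k=1}^{n+1} (\lambda^2 + k)}\, \frac{\varphi(\lambda)}{\lambda^{n+1}}$, which concludes the proof. ■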
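The two-sided bound of Lemma 7 can be checked numerically. In this sketch, $Q_n(\lambda)$ is approximated with a trapezoidal rule on a truncated interval, and the bounds are the ones obtained by following the proof, i.e., $n!\,\lambda^{n+1}\varphi(\lambda)/\prod_{k=1}^{n+1}(\lambda^2+k)$ and $n!\,\varphi(\lambda)/\lambda^{n+1}$; the function names and the integration settings are assumptions made for illustration.

```python
import math


def phi(t: float) -> float:
    """Standard Gaussian pdf."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def Q_n(lam: float, n: int, upper: float = 12.0, steps: int = 20000) -> float:
    """Trapezoidal estimate of the integral of (t - lam)^n * phi(t) over [lam, lam + upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = lam + i * h
        f = (t - lam) ** n * phi(t)
        total += f if 0 < i < steps else 0.5 * f
    return total * h


def lemma7_bounds(lam: float, n: int):
    """Lower and upper bounds on Q_n(lam) as read from the proof of Lemma 7."""
    upper = math.factorial(n) * phi(lam) / lam ** (n + 1)
    prod = math.prod(lam * lam + k for k in range(1, n + 2))
    lower = math.factorial(n) * lam ** (n + 1) * phi(lam) / prod
    return lower, upper
```

For $n = 0$ the bounds reduce to the classical ones on the Gaussian tail $Q(\lambda)$, and both sides become tight as $\lambda$ grows.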



Appendix B

Proof of Lemma 1 "$p$-optimal Level Definiteness"

Proof: For $2 \le p < \infty$, $|t - \lambda|^p$ is a continuous, coercive and strictly convex function of $\lambda$ over $\mathbb{R}$, and therefore so is $\int_{\mathcal R_k} |t - \lambda|^p \varphi_0(t)\,dt$ since $\varphi_0(t) > 0$. It follows that the function $\int_{\mathcal R_k} |t - \lambda|^p \varphi_0(t)\,dt$ has a unique minimizer on $\mathbb{R}$. Moreover, this minimizer is necessarily located in $\mathcal R_k$ since $\int_{\mathcal R_k} |t - \lambda|^p \varphi_0(t)\,dt$ is monotonically decreasing (resp. increasing) on $(-\infty, t_k)$ (resp. $(t_{k+1}, +\infty)$). Consequently, $\omega_{k,p}$ exists and is unique.

For proving the limit case $p \to \infty$, for finite bins $\mathcal R_k$ ($k \notin \{1, \mathcal B\}$) and without loss of generality for $t_k \ge 0$, relation (26) in Lemma 6 with $a = t_k$ and $b = t_{k+1}$, together with the squeeze theorem, shows that

$$\lim_{p\to+\infty} \omega_{k,p} = \lim_{p\to+\infty} \tfrac{1}{1+S^{1/p}}\,(S^{1/p} t_k + t_{k+1}) = \lim_{p\to+\infty} \tfrac{1}{1+S^{1/p}}\,(t_k + S^{1/p} t_{k+1}) = \omega_{k,\infty},$$

where $S = \varphi_0(t_k)/\varphi_0(t_{k+1})$.

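For infinite bins $k \in \{1, \mathcal B\}$ and assuming again $t_k \ge 0$, it follows from the beginning of the proof that $\omega_{k,p}$ is the unique root on $[t_k, +\infty)$ of $\mathcal E_p(\lambda) := \int_{t_k}^{\lambda} (\lambda - t)^{p-1} \varphi_0(t)\,dt - \int_{\lambda}^{\infty} (t - \lambda)^{p-1} \varphi_0(t)\,dt$. Let $\tilde\omega_{k,p} \in [t_k, L]$ be the root of $\mathcal E_p(\lambda, L) := \int_{t_k}^{\lambda} (\lambda - t)^{p-1} \varphi_0(t)\,dt - \int_{\lambda}^{L} (t - \lambda)^{p-1} \varphi_0(t)\,dt$ for some $L \ge t_k$. We then have $\mathcal E_p(\tilde\omega_{k,p}) = \int_{t_k}^{\tilde\omega_{k,p}} (\tilde\omega_{k,p} - t)^{p-1} \varphi_0(t)\,dt - \int_{\tilde\omega_{k,p}}^{\infty} (t - \tilde\omega_{k,p})^{p-1} \varphi_0(t)\,dt = -\int_{L}^{\infty} (t - \tilde\omega_{k,p})^{p-1} \varphi_0(t)\,dt \le 0 = \mathcal E_p(\omega_{k,p})$, which implies $\omega_{k,p} \ge \tilde\omega_{k,p}$ since $\mathcal E_p$ is non-decreasing for $p \ge 1$. However, since $\tilde\omega_{k,p}$ is optimal on $[t_k, L]$, taking $L = L(p) = c\sqrt p$, for $c > 0$, we have by Lemma 6 with $a = t_k$ and $b = L(p)$, $\lim_{p\to+\infty} \omega_{k,p} \ge \lim_{p\to+\infty} \frac{1}{1+S^{1/p}}\,(S^{1/p} t_k + c\sqrt p) = +\infty$ since $S^{1/p} = \exp(-t_k^2/2p\sigma_0^2)\, \exp(c^2/2\sigma_0^2) = \Theta(1)$. This proves $\lim_{p\to+\infty} \omega_{k,p} = +\infty = \omega_{k,\infty}$ and $\omega_{k,p} = \Omega(\sqrt p)$ for $k \in \{1, \mathcal B\}$. ■

$^5$ Where we used the Lebesgue dominated convergence theorem to interchange the integration and derivation signs.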



Appendix C 

Proof of Lemma[2J "Asymptotic p- Quantization Characterization" 

The content of Lemma [2] is derived from this larger set of results which constitutes a toolbox lemma 
for other developments given in these appendices. 

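Lemma 8 (Extended Asymptotic p-Quantization Characterization). Given the Gaussian pdf $\varphi_0$ and its associated compressor function $\mathcal{G}$, choose $0 < \beta < 1$ and $p \in \mathbb{N}$, and define $T = T(B) = \sqrt{6\sigma_0^2 (\log 2)\,\beta B}$, $\mathcal{T} = [-T, T]$ and $\mathcal{T}^c = \mathbb{R} \setminus \mathcal{T}$. We have the following asymptotic properties (relative to $B$):

$$\mathcal{G}'(T(B)) = \Theta(2^{-\beta B}), \tag{27}$$
$$\#\{k : \mathcal{R}_k \subset \mathcal{T}^c\} = \Theta(B^{-1/2}\, 2^{(1-\beta)B}), \tag{28}$$
$$\int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt = O(B^{-(p+1)/2}\, 2^{-3\beta B}), \quad \forall\, \mathcal{R}_k \subset \mathcal{T}^c. \tag{29}$$

Moreover, for all $k$ such that $\mathcal{R}_k \subset \mathcal{T}$ and any $c \in \mathcal{R}_k$,

$$\tau_k := t_{k+1} - t_k = O(2^{-(1-\beta)B}), \tag{30}$$
$$1 \le \frac{\varphi_0(t_k)}{\varphi_0(t_{k+1})} = \exp\big(O(B^{1/2}\, 2^{-(1-\beta)B})\big) = 1 + O(B^{1/2}\, 2^{-(1-\beta)B}), \tag{31}$$
$$\int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt \simeq_B \frac{\tau_k^{p+1}}{(p+1)\,2^p}\, \varphi_0(c), \tag{32}$$
$$\mathcal{G}'(c) \simeq_B \frac{\alpha}{\tau_k}. \tag{33}$$

Finally, if $k$ is such that $T(B) \in \mathcal{R}_k$ then, writing the interval length/measure $\mathcal{L}(A) = \int_A dt$ for $A \subset \mathbb{R}$,

$$\mathcal{L}(\mathcal{R}_k \cap \mathcal{T}) = O(2^{-(1-\beta)B}), \tag{34}$$
$$\mathcal{G}'(\omega_{k,p}) \le \max\big(\mathcal{G}'(t_k), \mathcal{G}'(t_{k+1})\big) = O(2^{-\beta B}), \tag{35}$$
$$\int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt = O(B^{-(p+1)/2}\, 2^{-3\beta B}). \tag{36}$$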
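Relations (27) and (28) can be checked numerically. The sketch below is only an illustration: it assumes $\sigma_0 = 1$, reads the threshold as $T(B) = \sqrt{6\sigma_0^2 (\log 2)\beta B}$, takes $\mathcal{G}'$ to be the Gaussian pdf of standard deviation $\sqrt{3}\sigma_0$, and approximates the number of outer bins by $2 \cdot 2^B (1 - \mathcal{G}(T))$. The function names are hypothetical and do not come from the paper.

```python
import math

def threshold_T(bits, beta, sigma0=1.0):
    """Assumed reading of the threshold: T(B) = sqrt(6 sigma0^2 log(2) beta B)."""
    return math.sqrt(6.0 * sigma0**2 * math.log(2.0) * beta * bits)

def compressor_slope(t, sigma0=1.0):
    """G'(t), taken as the Gaussian pdf with standard deviation sqrt(3)*sigma0."""
    s = math.sqrt(3.0) * sigma0
    return math.exp(-0.5 * (t / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def outer_bin_count(bits, beta, sigma0=1.0):
    """Approximate #{k : R_k in T^c} by 2 * 2^B * (1 - G(T(B)))."""
    x = threshold_T(bits, beta, sigma0) / (math.sqrt(3.0) * sigma0)
    tail = 0.5 * math.erfc(x / math.sqrt(2.0))  # Gaussian tail Q(x) = 1 - G(T)
    return 2.0 * 2.0**bits * tail

def scaled_quantities(bits, beta):
    """Return G'(T) * 2^(beta B) and count / (B^(-1/2) 2^((1-beta) B))."""
    slope_ratio = compressor_slope(threshold_T(bits, beta)) * 2.0 ** (beta * bits)
    count_ratio = outer_bin_count(bits, beta) / (bits**-0.5 * 2.0 ** ((1.0 - beta) * bits))
    return slope_ratio, count_ratio

if __name__ == "__main__":
    for bits in (4, 8, 16, 32):
        print(bits, scaled_quantities(bits, beta=0.5))
```

Both rescaled quantities stay bounded away from zero and infinity as $B$ grows, which is what the $\Theta(\cdot)$ statements assert; the first one is in fact constant under these assumptions.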



Proof: In this proof we use the quantizer symmetry to restrict the analysis to the half (positive) real line $\mathbb{R}_+$, on which $\varphi_0$ is decreasing.

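Relation (27) comes from the definition of $T(B)$ and that of $\mathcal{G}' = \gamma_{0,\sqrt{3}\sigma_0}$. For proving (28), we can observe that $\mathcal{G}(\lambda) = |||\varphi_0|||_{1/3}^{-1/3} \int_{-\infty}^{\lambda} \varphi_0^{1/3}(t)\,dt = 1 - Q(\lambda/\sqrt{3}\sigma_0)$, where $Q(\lambda) = \int_{\lambda}^{+\infty} \gamma_{0,1}(u)\,du$.

Since $\frac{\lambda}{1+\lambda^2}\,\gamma_{0,1}(\lambda) \le Q(\lambda) \le \frac{1}{\lambda}\,\gamma_{0,1}(\lambda)$, we obtain

$$\frac{3\sigma_0^2\,\lambda}{3\sigma_0^2 + \lambda^2}\,\mathcal{G}'(\lambda) \le 1 - \mathcal{G}(\lambda) \le \frac{3\sigma_0^2}{\lambda}\,\mathcal{G}'(\lambda).$$

Taking $\lambda = T(B)$ in the last inequalities and using (27), we deduce from the quantizer definition

$$\#\{k : \mathcal{R}_k \subset \mathcal{T}^c\} = 2\,\#\{k : t_k \ge T(B)\} = 2\,\alpha^{-1}\,(1 - \mathcal{G}(T)) = \Theta(B^{-1/2}\, 2^{(1-\beta)B}).$$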






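Relation (29) is proved by noting that, if $t_k \ge T(B)$,

$$\int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt \le \int_{\mathcal{R}_k} (t - t_k)^p \varphi_0(t)\,dt \le \int_{t_k}^{\infty} (t - t_k)^p \varphi_0(t)\,dt,$$

where the first inequality follows from the p-optimality of $\omega_{k,p} \in \mathcal{R}_k$. However, from Lemma 7 we know, for $\lambda \in \mathbb{R}_+$, an upper bound on $Q_p(\lambda) := \int_{\lambda}^{\infty} (t - \lambda)^p \gamma_{0,1}(t)\,dt$, with $\sigma_0^p\, Q_p(\lambda/\sigma_0) = \int_{\lambda}^{\infty} (t - \lambda)^p \varphi_0(t)\,dt$. Therefore, since $Q_p$ is decreasing and $\varphi_0 \propto (\mathcal{G}')^3$,

$$\int_{t_k}^{\infty} (t - t_k)^p \varphi_0(t)\,dt \le \sigma_0^p\, Q_p(t_k/\sigma_0) \le \sigma_0^p\, Q_p(T/\sigma_0) = O(B^{-(p+1)/2}\, 2^{-3\beta B}).$$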



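Relation (30) is obtained by observing that $\mathcal{G}$ is concave on $\mathbb{R}_+$. This implies $\tau_k \le \alpha/\mathcal{G}'(t_{k+1})$ and, if $k$ is such that $0 < t_{k+1} < T(B)$, $\tau_k = O(2^{-(1-\beta)B})$. For (31), keeping the same $k$, we note that $1 \le \varphi_0(t_k)/\varphi_0(t_{k+1}) = \exp\big(\tfrac{1}{2\sigma_0^2}\,\tau_k (t_k + t_{k+1})\big) \le \exp\big(\tfrac{1}{\sigma_0^2}\,\tau_k t_{k+1}\big) = \exp\big(O(B^{1/2}\, 2^{-(1-\beta)B})\big)$, which is then arbitrarily close to 1.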



For proving (32), we assume first $p \ge 1$. Let us consider (24) and (25) with $a = t_k$, $b = t_{k+1}$, $C = \varphi_0(t_{k+1})$ and $D = \varphi_0(t_k)$ with $0 < t_{k+1} < T(B)$. From (31) we see that $1 \le \delta = 1 + o(1)$. We show easily that this involves the equivalent relations $C \simeq_B D$, $C/D \simeq_B 1$ and $D/C \simeq_B 1$. Therefore, $(1 + (D/C)^{(p+1)/p}) \simeq_B 2$ and $(1 + (C/D)^{(p+1)/p}) \simeq_B 2$. Moreover, $C \simeq_B \varphi_0(c)$ and $D \simeq_B \varphi_0(c)$ for any $c \in \mathcal{R}_k$, so that (24) and (25) show finally $\int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt \lesssim_B \frac{\tau_k^{p+1}}{(p+1)2^p}\,\varphi_0(c)$ and $\int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt \gtrsim_B \frac{\tau_k^{p+1}}{(p+1)2^p}\,\varphi_0(c)$, which proves the relation. The case $p = 0$ is demonstrated similarly by observing that $\varphi_0(t_{k+1})\,\tau_k \le p_k := \int_{\mathcal{R}_k} \varphi_0(t)\,dt \le \varphi_0(t_k)\,\tau_k$.

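Let us now turn to showing (33). From (31) and since $\mathcal{G}' \propto \varphi_0^{1/3}$, $1 \le \mathcal{G}'(t_k)/\mathcal{G}'(t_{k+1}) = 1 + o(1)$ so that $\mathcal{G}'(t_k)/\mathcal{G}'(t_{k+1}) \simeq_B 1$. By concavity of $\mathcal{G}$ on $\mathbb{R}_+$, we know that $\mathcal{G}'(t_{k+1}) \le \alpha/\tau_k \le \mathcal{G}'(t_k)$. Therefore, $1 \le (\mathcal{G}'(t_{k+1}))^{-1}\,\alpha/\tau_k = 1 + o(1)$, which yields $\mathcal{G}'(t_{k+1}) \simeq_B \alpha/\tau_k$. By the concavity argument again, we have $\mathcal{G}'(t_k) \ge \mathcal{G}'(c) \ge \mathcal{G}'(t_{k+1})$ for any $c \in \mathcal{R}_k$, and thus $1 + o(1) = \mathcal{G}'(t_k)/\mathcal{G}'(t_{k+1}) \ge \mathcal{G}'(c)/\mathcal{G}'(t_{k+1}) \ge 1$. This implies $\mathcal{G}'(c) \simeq_B \mathcal{G}'(t_{k+1}) \simeq_B \alpha/\tau_k$.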

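If $k$ is such that $0 \le t_k \le T(B) \le t_{k+1}$, using again the concavity of $\mathcal{G}$ on $\mathbb{R}_+$, we find $\mathcal{L}(\mathcal{R}_k \cap \mathcal{T}) = T(B) - t_k \le (\mathcal{G}(T(B)) - \mathcal{G}(t_k))/\mathcal{G}'(T(B)) \le \alpha/\mathcal{G}'(T(B)) = O(2^{-(1-\beta)B})$, which proves (34).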

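For showing (35), we note that $\mathcal{G}'(t_k) = \mathcal{G}'(T)\,(\mathcal{G}'(t_k)/\mathcal{G}'(T))$. Since $\mathcal{G}'(t_k)/\mathcal{G}'(T) = \exp\big(\tfrac{1}{6\sigma_0^2}(T - t_k)(T + t_k)\big) \le \exp\big(\tfrac{1}{3\sigma_0^2}(T - t_k)\,T\big) = \exp\big(O(B^{1/2}\, 2^{-(1-\beta)B})\big)$, which is arbitrarily close to 1 (i.e., it is $e^{o(1)}$), we find $\mathcal{G}'(t_k) = O(2^{-\beta B})$, i.e., it inherits the behavior of $\mathcal{G}'(T)$.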

The last relation (36) is proved similarly to (29) by appealing again to Lemma 7,



(t-t fc )^o(t)dt < / (t-t k fMt)dt^ v^ V0 (t k ) = O{B-^l 2 2-^ B ), 



where the asymptotic relation is obtained by seeing that, as soon as $T - t_k \le 1/2$ (which is always possible to meet thanks to (34)),

J_ - lfi_ t^\-i < Lfi ,o^=^l 



and $\varphi_0(t_k) = O(2^{-3\beta B})$ since $\varphi_0 \propto (\mathcal{G}')^3$.
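Appendix D

Proof of Lemma 3: "Asymptotic Weighted $\ell_p$-Distortion"

Before proving Lemma 3, let us show the following asymptotic equivalence.

Lemma 9. Let $p \in \mathbb{N} \setminus \{0\}$ and $\gamma > p - 3$. Then

$$\sum_{k=1}^{\mathcal{B}} [\mathcal{G}'(\omega_{k,p})]^{\gamma} \int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt \simeq_B \frac{2^{-pB}}{(p+1)\,2^p} \int_{\mathbb{R}} [\mathcal{G}'(t)]^{\gamma - p}\, \varphi_0(t)\,dt. \tag{37}$$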



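Proof: Let us use the threshold $T(B)$ defined in Lemma 8 for splitting the sum (37) in two parts, i.e., using the quantizer symmetry,

$$\sum_{k=1}^{\mathcal{B}} [\mathcal{G}'(\omega_{k,p})]^{\gamma} \int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt = 2 \sum_{k:\, 0 < t_{k+1} < T(B)} [\mathcal{G}'(\omega_{k,p})]^{\gamma} \int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt + R,$$

where the residual $R$ reads

$$R = 2\,[\mathcal{G}'(\omega_{k',p})]^{\gamma} \int_{\mathcal{R}_{k'}} |t - \omega_{k',p}|^p \varphi_0(t)\,dt + 2 \sum_{k:\, t_k \ge T(B)} [\mathcal{G}'(\omega_{k,p})]^{\gamma} \int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt,$$

where $k'$ is such that $t_{k'} < T(B) \le t_{k'+1}$.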

From Lemma 8, we can easily bound this residual. We know from (27), (29), (35) and (36) that, for all $k \in \{ j : \omega_{j,p} \ge t_j \ge T(B)\} \cup \{k'\}$,

$$[\mathcal{G}'(\omega_{k,p})]^{\gamma} \int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt = O\big(2^{-\beta(\gamma+3)B}\, B^{-(p+1)/2}\big).$$

However, (28) tells us that the sum in $R$ is made of no more than $1 + O(B^{-1/2}\, 2^{(1-\beta)B}) = O(B^{-1/2}\, 2^{(1-\beta)B})$ terms, so that

$$R = O\big(B^{-(p+2)/2}\, 2^{-(\beta(\gamma+4)-1)B}\big).$$



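Let us now study the terms for which $0 \le t_{k+1} \le T(B)$. Using (32) and (33) provides

$$\begin{aligned}
\sum_{k=1}^{\mathcal{B}} [\mathcal{G}'(\omega_{k,p})]^{\gamma} \int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt
&\simeq_B 2 \sum_{k:\, 0 < t_{k+1} < T(B)} [\mathcal{G}'(\omega_{k,p})]^{\gamma}\, \frac{\tau_k^{p+1}}{(p+1)\,2^p}\, \varphi_0(\omega_{k,p}) + R \\
&\simeq_B \frac{2 \cdot 2^{-pB}}{(p+1)\,2^p} \sum_{k:\, 0 < t_{k+1} < T(B)} [\mathcal{G}'(\omega_{k,p})]^{\gamma - p}\, \varphi_0(\omega_{k,p})\, \tau_k + R \\
&\simeq_B \frac{2 \cdot 2^{-pB}}{(p+1)\,2^p} \int_{0}^{T(B)} [\mathcal{G}'(t)]^{\gamma - p}\, \varphi_0(t)\,dt + R,
\end{aligned}$$

where, knowing that $0 \le t_{k+1} \le T(B)$, we have also used (32) with $p = 0$ to see that $p_k = \int_{\mathcal{R}_k} \varphi_0(t)\,dt \simeq_B \varphi_0(c')\,\tau_k$ for any $c' \in \mathcal{R}_k$.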
Therefore, provided that $\beta(\gamma + 4) \ge p + 1$, which means that $\gamma > p - 3$ since $\beta < 1$, the residual $R$ decreases faster than the first term in the right-hand side of the last equivalence relation, so that

$$\sum_{k=1}^{\mathcal{B}} [\mathcal{G}'(\omega_{k,p})]^{\gamma} \int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt \simeq_B \frac{2^{-pB}}{(p+1)\,2^p} \int_{\mathbb{R}} [\mathcal{G}'(t)]^{\gamma - p}\, \varphi_0(t)\,dt,$$

since $T(B) = \Theta(B^{1/2})$ by definition. ■

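With the three previous lemmata under our belts, we are now ready to prove Lemma 3.

Proof of Lemma 3: For $z_i \sim_{\rm iid} \mathcal{N}(0, \sigma_0^2)$ with pdf $\varphi_0$, using the SLLN applied to $z_i$ conditionally on each quantization bin, we have

$$\|\mathcal{Q}_p[z] - z\|_{p,w}^p := \sum_{i=1}^{M} [\mathcal{G}'(\mathcal{Q}_p[z_i])]^{p-2}\, |z_i - \mathcal{Q}_p[z_i]|^p \simeq_M M \sum_{k=1}^{\mathcal{B}} [\mathcal{G}'(\omega_{k,p})]^{p-2} \int_{\mathcal{R}_k} |t - \omega_{k,p}|^p \varphi_0(t)\,dt,$$

where we used implicitly the quantizer symmetry in the last relation. This last relation is characterized by Lemma 9 by taking $n = p$ and $\gamma = p - 2 > p - 3$, so that

$$\|\mathcal{Q}_p[z] - z\|_{p,w}^p \simeq_{B,M} M\, \frac{2^{-pB}}{(p+1)\,2^p} \int_{\mathbb{R}} [\mathcal{G}'(t)]^{-2}\, \varphi_0(t)\,dt = M\, \frac{2^{-pB}}{(p+1)\,2^p}\, |||\varphi_0|||_{1/3}.$$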



Appendix E 

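Proof of Lemma 5: "Gaussian $\ell_{p,w}$-Norm Expectation"

First, the inequality $\mathbb{E}\|\xi\|_{p,w} \le (\mathbb{E}\|\xi\|_{p,w}^p)^{1/p}$ follows from the Jensen inequality applied on the convex function $(\cdot)^p$ on $\mathbb{R}_+$. Second, from our result in [8, Appendix C] it is easy to show that

$$\mathbb{E}\|\xi\|_{p,w} \ge (\mathbb{E}\|\xi\|_{p,w}^p)^{1/p}\, \big(1 + (\mathbb{E}\|\xi\|_{p,w}^p)^{-2}\, \mathrm{Var}\,\|\xi\|_{p,w}^p\big)^{\frac{1}{p} - 1}.$$

Moreover, $\mathbb{E}\|\xi\|_{p,w}^p = \|w\|_p^p\, \mathbb{E}|Z|^p$, while

$$\mathrm{Var}\,\|\xi\|_{p,w}^p = \sum_i \mathrm{Var}\,|w_i \xi_i|^p = \|w\|_{2p}^{2p}\, \mathrm{Var}\,|Z|^p.$$

Therefore, assuming CM weights,

$$\mathbb{E}\|\xi\|_{p,w} / (\mathbb{E}\|\xi\|_{p,w}^p)^{1/p} \ge \Big(1 + \big(\rho_{2p}^{\max}/\rho_p^{\min}\big)^{2p}\, M^{-1}\, (\mathbb{E}|Z|^p)^{-2}\, \mathrm{Var}\,|Z|^p\Big)^{\frac{1}{p} - 1} \ge \big(1 + 2^{p+1}\, \theta_p^p\, M^{-1}\big)^{\frac{1}{p} - 1},$$

since $\rho_{2p}^{\max} \le \rho_{\infty}^{\max}$, and $(\mathbb{E}|Z|^p)^{-2}\, \mathrm{Var}\,|Z|^p < 2^{p+1}$.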
Appendix F

Proof of Proposition 1: "RIP$_{p,w}$ Matrix Existence"

The proof proceeds simply by considering the Lipschitz function $F(u) = \|u\|_{p,w}$ and the expected value $\mu = \mathbb{E} F(\xi)$ for a random vector $\xi \sim \mathcal{N}^{M}(0, 1)$ in [H Appendix A]. The Lipschitz constant of $F$ satisfies

$$|F(u) - F(v)| \,/\, \|u - v\|_2 \le \|w\|_{\infty}\, \lambda_p,$$

with $\lambda_p = \max(M^{(2-p)/2p}, 1)$ for $p \ge 1$. The value $\mu = \mathbb{E}\|\xi\|_{p,w}$ can be estimated thanks to Lemma 5. Indeed, it tells us that if $M \ge 2(2\theta_p)^p$,

$$2^{\frac{1}{p} - 1}\, \nu_p\, \|w\|_p \le \mu \le \nu_p\, \|w\|_p,$$

with $\nu_p^p = \mathbb{E}|Z|^p = 2^{p/2}\, \pi^{-1/2}\, \Gamma\big(\tfrac{p+1}{2}\big)$.
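The closed form of the absolute Gaussian moment $\mathbb{E}|Z|^p$ quoted above can be sanity-checked against direct numerical integration. The sketch below is an illustration only; the function names, the truncation width, and the number of quadrature points are arbitrary choices, not values from the paper.

```python
import math

def abs_moment_closed_form(p):
    """E|Z|^p = 2^(p/2) * Gamma((p+1)/2) / sqrt(pi) for Z ~ N(0, 1)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1) / 2.0) / math.sqrt(math.pi)

def abs_moment_quadrature(p, half_width=12.0, n=20001):
    """Composite Simpson estimate of E|Z|^p; n must be odd.

    Integrates x^p * exp(-x^2/2) over [0, half_width] and doubles the result
    by symmetry; the tail beyond half_width is negligible.
    """
    dx = half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = i * dx
        weight = 1.0 if i in (0, n - 1) else (4.0 if i % 2 else 2.0)
        total += weight * x**p * math.exp(-0.5 * x * x)
    return 2.0 * total * dx / 3.0 / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    for p in (1, 2, 3, 4, 6):
        print(p, abs_moment_closed_form(p), abs_moment_quadrature(p))
```

For even $p$ the closed form reduces to the familiar values $\mathbb{E}Z^2 = 1$ and $\mathbb{E}Z^4 = 3$.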

Inserting these results in [U Appendix A], it is easy to show that a matrix $\Phi \sim \mathcal{N}^{M \times N}(0, 1)$ is RIP$_{p,w}(K, \delta, \mu)$ with a probability higher than $1 - \eta$ if

M 2/max(2, P ) ^ c (^) 2 (lir ]og[eg(l + 12*" X )] + log 2 ;), 

for some constant c > 0. 



Appendix G 

Proof of Proposition^ Dequantizing Reconstruction Error 

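Proof: We have to bound $\epsilon_p / \mathbb{E}\|\xi\|_{p,w}$, with $\xi \sim \mathcal{N}^{M}(0, 1)$, when $M$ is large and under the HRA. First, according to Lemma 5, using the SLLN and using the same decomposition as in the proof of Lemma 9 with the threshold $T(B)$ (with $\beta = (p+1)/(p+2)$) and the bounds provided by Lemma 8, we find almost surely

$$\mu^p := (\mathbb{E}\|\xi\|_{p,w})^p \simeq_M \sum_{i=1}^{M} [\mathcal{G}'(\mathcal{Q}_p[z_i])]^{p-2}\, \mathbb{E}|Z|^p \simeq_M M\, \mathbb{E}|Z|^p \sum_{k=1}^{\mathcal{B}} [\mathcal{G}'(\omega_{k,p})]^{p-2} \int_{\mathcal{R}_k} \varphi_0(t)\,dt.$$

The sum in the last expression is characterized by Lemma 9 by setting inside (37) $n = 0$ (i.e., replacing the exponent of $|t - \omega_{k,p}|$ by 0) and $\gamma = p - 2$. This provides

$$\mu^p \simeq_{B,M} M\, \mathbb{E}|Z|^p \int_{\mathbb{R}} [\mathcal{G}'(t)]^{p-2}\, \varphi_0(t)\,dt = M\, \mathbb{E}|Z|^p\, \Big[\int_{\mathbb{R}} \varphi_0^{1/3}(t)\,dt\Big]^{2-p}\, \Big[\int_{\mathbb{R}} \varphi_0^{(p+1)/3}(t)\,dt\Big].$$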
jr Jm 

Therefore, using the value $\epsilon_p$ defined in Lemma 3,

$$\frac{\epsilon_p^p}{\mu^p} \simeq_{B,M} \frac{2^{-p(B+1)}}{(p+1)\, \mathbb{E}|Z|^p}\; |||\varphi_0|||_{1/3}^{(p+1)/3}\; |||\varphi_0|||_{(p+1)/3}^{-(p+1)/3}.$$



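However, for $a > 0$,

$$|||\varphi_0|||_a^a := \int_{\mathbb{R}} \varphi_0^a(t)\,dt = (2\pi\sigma_0^2)^{-a/2}\, (2\pi\sigma_0^2/a)^{1/2} \int_{\mathbb{R}} \gamma_{0,\sigma_0/\sqrt{a}}(t)\,dt = (2\pi\sigma_0^2)^{(1-a)/2}\, \frac{1}{\sqrt{a}}.$$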
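The Gaussian power integral above, and the way it combines in the ratio $\epsilon_p^p/\mu^p$, can be verified numerically. The sketch below is illustrative only: function names, the integration range, and the grid size are assumptions. It checks $\int \varphi_0^a = (2\pi\sigma_0^2)^{(1-a)/2}/\sqrt{a}$ and the resulting identity $\big(\int \varphi_0^{1/3}\big)^{p+1} / \int \varphi_0^{(p+1)/3} = \sqrt{p+1}\,(6\pi\sigma_0^2)^{p/2}$.

```python
import math

def power_integral_closed_form(a, sigma0=1.0):
    """Integral of phi0(t)^a over R for the N(0, sigma0^2) pdf phi0, a > 0."""
    return (2.0 * math.pi * sigma0**2) ** ((1.0 - a) / 2.0) / math.sqrt(a)

def power_integral_quadrature(a, sigma0=1.0, n=40001):
    """Composite Simpson estimate of the same integral; n must be odd.

    phi0^a is a Gaussian bump of standard deviation sigma0/sqrt(a), so the
    grid spans +-12 of those standard deviations.
    """
    half = 12.0 * sigma0 / math.sqrt(a)
    dx = 2.0 * half / (n - 1)
    norm = (2.0 * math.pi * sigma0**2) ** (-a / 2.0)
    total = 0.0
    for i in range(n):
        t = -half + i * dx
        weight = 1.0 if i in (0, n - 1) else (4.0 if i % 2 else 2.0)
        total += weight * norm * math.exp(-a * t * t / (2.0 * sigma0**2))
    return total * dx / 3.0

def norm_ratio(p, sigma0=1.0):
    """(int phi0^(1/3))^(p+1) / int phi0^((p+1)/3), from the closed form."""
    return (power_integral_closed_form(1.0 / 3.0, sigma0) ** (p + 1)
            / power_integral_closed_form((p + 1) / 3.0, sigma0))

if __name__ == "__main__":
    for p in (2, 3, 5):
        print(p, norm_ratio(p), math.sqrt(p + 1) * (6.0 * math.pi) ** (p / 2.0))
```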
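Consequently, $|||\varphi_0|||_{1/3}^{(p+1)/3} = 3^{(p+1)/2}\, (2\pi\sigma_0^2)^{(p+1)/3}$ and $|||\varphi_0|||_{(p+1)/3}^{(p+1)/3} = (2\pi\sigma_0^2)^{(2-p)/6} / \sqrt{(p+1)/3}$, so that

$$\frac{\epsilon_p^p}{\mu^p} \simeq_{B,M} \frac{2^{-p(B+1)}}{\sqrt{p+1}\; \mathbb{E}|Z|^p}\, (6\pi\sigma_0^2)^{p/2}.$$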

Knowing that (E\Z\p)^ p ^ c VpT^ with c = 8\/2/(9\/e) El, we get 

2~ B 



e < ^ O-B (g+j) 2p ^ 



with d = (9/8)(e^/3) 1 / 2 . 



Appendix H 
Computation of the $\omega_{k,p}$

This section describes a numerical procedure for efficiently computing the p-optimal levels $\omega_{k,p}$ of a Gaussian source $\mathcal{N}(0, 1)$ for integer $p \ge 2$, defined by $\omega_{k,p} := \mathrm{argmin}_{\lambda \in \mathcal{R}_k}\, \mathcal{E}_{k,p}(\lambda)$, where $\mathcal{E}_{k,p}(\lambda) = \int_{t_k}^{t_{k+1}} |t - \lambda|^p\, \gamma_{0,1}(t)\,dt$. As $\mathcal{E}_{k,p}(\lambda)$ is strictly convex and differentiable, the desired $\omega_{k,p}$ are the unique stationary points satisfying $\mathcal{E}'_{k,p}(\omega_{k,p}) = 0$.

We compute the $\omega_{k,p}$ by Newton's method, using standard numerical quadrature for $\mathcal{E}'_{k,p}$ and $\mathcal{E}''_{k,p}$. We handle the semi-infinite bins by replacing $t_1 = -\infty$ and $t_{\mathcal{B}+1} = \infty$ by $-39$ and $+39$, respectively (chosen as the smallest integer $x$ so that $\gamma_{0,1}(x) = 0$ when evaluated in double precision floating point arithmetic). Given quadrature weights $c_i$, we approximate $\mathcal{E}_{k,p}$ by $\tilde{\mathcal{E}}_{k,p}(\lambda) = \sum_{i=1}^{N} c_i\, \gamma_{0,1}(x_i)\, |x_i - \lambda|^p$ with $x_i = t_k + (i - 1)\Delta x$, where $\Delta x = (t_{k+1} - t_k)/(N - 1)$. We then have $\tilde{\mathcal{E}}'_{k,p}(\lambda) = \sum_{i=1}^{N} c_i\, \gamma_{0,1}(x_i)\, p\, |x_i - \lambda|^{p-1}\, \mathrm{sign}(\lambda - x_i)$ and $\tilde{\mathcal{E}}''_{k,p}(\lambda) = \sum_{i=1}^{N} c_i\, \gamma_{0,1}(x_i)\, p(p-1)\, |x_i - \lambda|^{p-2}$. We initialize with the midpoint for each of the finite bins, i.e., set $\lambda_k^{(0)} = (t_k + t_{k+1})/2$ for $2 \le k \le \mathcal{B} - 1$, and $\lambda_1^{(0)} = t_2$, $\lambda_{\mathcal{B}}^{(0)} = t_{\mathcal{B}}$ for the semi-infinite bins. For each $k$ we then iterate the Newton step $\lambda_k^{(n)} = \lambda_k^{(n-1)} - \tilde{\mathcal{E}}'_{k,p}(\lambda_k^{(n-1)}) / \tilde{\mathcal{E}}''_{k,p}(\lambda_k^{(n-1)})$ until the convergence criterion $|(\lambda_k^{(n)} - \lambda_k^{(n-1)})/\lambda_k^{(n)}| < 10^{-15}$ is met. We used $c_i$ given by the fourth-order accurate Simpson's rule, e.g., $c = (1, 4, 2, 4, \ldots, 2, 4, 1)\,\Delta x/3$, which yielded empirically observed $O(N^{-4})$ convergence of the calculated $\omega_{k,p}$. Results in this paper employed $N = 10^4 + 1$ quadrature points, sufficient to yield $\omega_{k,p}$ accurate to machine precision.
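The procedure described in this appendix can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the thresholds are built under the assumption that the compressor slope is the Gaussian pdf of standard deviation $\sqrt{3}$ (for $\sigma_0 = 1$) with $\mathcal{G}(t_k) = (k-1)/\mathcal{B}$, the default grid is much coarser than the $N = 10^4 + 1$ points quoted above, and an iteration cap is added as a safeguard.

```python
import math
from statistics import NormalDist

CLIP = 39.0  # stand-in for +-infinity: the N(0,1) pdf underflows to 0 there

def gaussian_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def compander_thresholds(n_bins):
    """Thresholds t_1..t_{n_bins+1}, assuming G is the cdf of N(0, 3)."""
    g = NormalDist(0.0, math.sqrt(3.0))
    inner = [g.inv_cdf(k / n_bins) for k in range(1, n_bins)]
    return [-math.inf] + inner + [math.inf]

def p_optimal_level(t_lo, t_hi, p, n=1001, tol=1e-15, max_iter=100):
    """Newton iteration on the Simpson-discretized derivative of
    E(lam) = int_{t_lo}^{t_hi} |t - lam|^p pdf(t) dt, integer p >= 2, odd n."""
    a, b = max(t_lo, -CLIP), min(t_hi, CLIP)
    dx = (b - a) / (n - 1)
    xs = [a + i * dx for i in range(n)]
    # Simpson weights (1, 4, 2, ..., 2, 4, 1) * dx / 3, premultiplied by the pdf.
    w = []
    for i, x in enumerate(xs):
        c = 1.0 if i in (0, n - 1) else (4.0 if i % 2 else 2.0)
        w.append(c * dx / 3.0 * gaussian_pdf(x))
    # Start at the finite endpoint of a semi-infinite bin, else at the midpoint.
    if math.isinf(t_lo):
        lam = b
    elif math.isinf(t_hi):
        lam = a
    else:
        lam = 0.5 * (a + b)
    for _ in range(max_iter):
        d1 = sum(wi * p * abs(lam - x) ** (p - 1) * math.copysign(1.0, lam - x)
                 for wi, x in zip(w, xs))
        d2 = sum(wi * p * (p - 1) * abs(lam - x) ** (p - 2)
                 for wi, x in zip(w, xs))
        new = lam - d1 / d2
        if abs(new - lam) <= tol * abs(new):
            return new
        lam = new
    return lam  # iteration cap reached: return the last iterate

if __name__ == "__main__":
    t = compander_thresholds(8)
    print([p_optimal_level(t[k], t[k + 1], p=4) for k in range(8)])
```

For $p = 2$ the discretized derivative is linear in $\lambda$, so a single Newton step lands on the bin centroid, which gives a convenient correctness check.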